AITopics | language expansion

language expansion

Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR

Yang, Hongli, Li, Sheng, Huang, Hao, Tuohan, Ayiduosi, Peng, Yizhou

arXiv.org Artificial Intelligence, Sep-29-2025

Recent advancements in multilingual automatic speech recognition (ASR) have been driven by large-scale end-to-end models like Whisper. However, challenges such as language interference and expanding to unseen languages (language expansion) without degrading performance persist. This paper addresses these with three contributions: 1) Entire Soft Prompt Tuning (Entire SPT), which applies soft prompts to both the encoder and decoder, enhancing feature extraction and decoding; 2) Language-Aware Prompt Tuning (LAPT), which leverages cross-lingual similarities to encode shared and language-specific features using lightweight prompt matrices; 3) SPT-Whisper, a toolkit that integrates SPT into Whisper and enables efficient continual learning. Experiments across three languages from FLEURS demonstrate that Entire SPT and LAPT outperform Decoder SPT by 5.0% and 16.0% in language expansion tasks, respectively, providing an efficient solution for dynamic, multilingual ASR models with minimal computational overhead.
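
The idea of shared and language-specific prompt matrices can be illustrated with a minimal sketch. It is not the SPT-Whisper toolkit's code: the function name, the language codes "xx" and "yy", the toy values, and the choice to concatenate the shared prompt before the language-specific one are all assumptions made for illustration, since the abstract does not say how the two are combined.

```python
from typing import Dict, List

Vector = List[float]
Seq = List[Vector]  # one vector per audio frame or token


def prepend_soft_prompts(features: Seq, shared: Seq, language_specific: Seq) -> Seq:
    """Return a new sequence: shared prompts, then language prompts, then the input."""
    prompts = [list(v) for v in shared] + [list(v) for v in language_specific]
    return prompts + [list(v) for v in features]


# Hypothetical toy values; in a real system these would be trained parameters
# while the backbone model stays frozen.
shared_prompt: Seq = [[0.1, 0.1]]
language_prompts: Dict[str, Seq] = {
    "xx": [[0.5, -0.5]],
    "yy": [[-0.3, 0.3]],
}
frames: Seq = [[1.0, 2.0], [3.0, 4.0]]

encoder_input = prepend_soft_prompts(frames, shared_prompt, language_prompts["xx"])
print(len(encoder_input))  # 4: one shared prompt, one language prompt, two frames
```

Under these assumptions, supporting another language amounts to adding one more entry to `language_prompts`, leaving the shared prompt and the backbone untouched.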

artificial intelligence, machine learning, natural language, (15 more...)

doi: 10.21437/Interspeech.2025-1875

arXiv: 2506.21577

Country:

North America > United States (0.28)
Asia > Japan (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Task Arithmetic for Language Expansion in Speech Translation

Cheng, Yao-Fei, Futami, Hayato, Kashiwagi, Yosuke, Tsunoo, Emiru, Teo, Wen Shen, Arora, Siddhant, Watanabe, Shinji

arXiv.org Artificial Intelligence, Sep-17-2024

Recent advances in large language models (LLMs) have spurred interest in speech-text multimodal foundation models, which achieve strong performance on instruction-based speech translation (ST). However, expanding the language pairs of an existing instruction-tuned ST system is costly because it requires re-training on a combination of new and previous datasets. We propose to add new language pairs by merging a model trained on the new language pairs with the existing model, using task arithmetic. We find that applying task arithmetic directly to ST causes the merged model to fail to follow instructions and thus to generate translations in the wrong language. To eliminate this language confusion, we propose an augmented task arithmetic method that merges an additional language control model, which is trained to generate the correct target language token following the instructions. Our experiments demonstrate that the proposed language control model achieves language expansion by eliminating language confusion. In our MuST-C and CoVoST-2 experiments, it yields improvements of up to 4.66 and 4.92 BLEU, respectively. In addition, we demonstrate that our task arithmetic framework can expand to a language pair for which neither paired ST training data nor a pre-trained ST model is available. We first synthesize the ST system from machine translation (MT) systems via task analogy, then merge the synthesized ST system into the existing ST model.
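
The merging step can be sketched in a few lines. This is a minimal illustration of task arithmetic in its usual formulation (a task vector is the fine-tuned weights minus the pre-trained weights, and merging adds scaled task vectors back onto the pre-trained weights), not the authors' implementation. The parameter name `layer.weight`, the toy values, and the single `scale` coefficient are assumptions; the abstract does not state how the vectors are weighted.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def task_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Element-wise difference: fine-tuned weights minus pre-trained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def merge(pretrained: Weights, vectors: List[Weights], scale: float = 1.0) -> Weights:
    """Add each scaled task vector to a copy of the pre-trained weights."""
    merged = {name: list(values) for name, values in pretrained.items()}
    for vec in vectors:
        for name, delta in vec.items():
            merged[name] = [m + scale * d for m, d in zip(merged[name], delta)]
    return merged


# Hypothetical toy checkpoints sharing one pre-trained starting point.
pretrained = {"layer.weight": [1.0, 2.0, 3.0]}
existing_st = {"layer.weight": [1.5, 2.0, 3.0]}    # existing language pairs
new_pair_st = {"layer.weight": [1.0, 2.5, 3.0]}    # new language pair
lang_control = {"layer.weight": [1.0, 2.0, 3.25]}  # language control model

vectors = [task_vector(m, pretrained) for m in (existing_st, new_pair_st, lang_control)]
merged = merge(pretrained, vectors)
print(merged)  # {'layer.weight': [1.5, 2.5, 3.25]}
```

In this toy case each model changes a different coordinate, so the merged weights keep all three changes; real checkpoints overlap, which is where interference such as the language confusion described above can arise.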

language expansion, speech translation, task arithmetic

arXiv: 2409.11274

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR

Song, Zheshu, Zhuo, Jianheng, Yang, Yifan, Ma, Ziyang, Zhang, Shixiong, Chen, Xie

arXiv.org Artificial Intelligence, Jun-7-2024

Recent years have witnessed significant progress in multilingual automatic speech recognition (ASR), driven by the emergence of end-to-end (E2E) models and the scaling of multilingual datasets. Despite that, two main challenges persist in multilingual ASR: language interference and the incorporation of new languages without degrading the performance of the existing ones. This paper proposes LoRA-Whisper, which incorporates LoRA matrix into Whisper for multilingual ASR, effectively mitigating language interference. Furthermore, by leveraging LoRA and the similarities between languages, we can achieve better performance on new languages while upholding ...

When new languages need to be integrated into a multilingual ASR system, a naive approach is to fine-tune the ASR model using data from these new languages. Unfortunately, this often results in catastrophic forgetting, referring to the phenomenon that the recognition performance of base languages tends to decline. To solve the above problem, Li et al. [26] proposes lifelong learning [27] solution which remedies the language interference problem by mixing base language data and new language data. However, this approach is inefficient and time-consuming. Libera et al. [28] explores various continual learning methods [29-34] to address the issue of catastrophic forgetting. While these approaches have helped alleviate the problem, it still persists.
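
A LoRA update can be sketched as follows. It uses the standard LoRA formula, W' = W + (alpha / r) * B * A with a frozen base weight W and low-rank factors A and B, and is not the LoRA-Whisper code. Keeping one adapter per language, the language codes "xx" and "yy", and the toy matrices are assumptions for illustration; the excerpt above does not spell out how adapters are assigned to languages.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Effective weight W + (alpha / r) * B @ A; the base weight w is not modified."""
    rank = len(a)  # A has shape (r x d_in), B has shape (d_out x r)
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Frozen base weight shared by all languages (toy 2 x 2 identity).
base_weight: Matrix = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical per-language adapters stored as (A, B) pairs of rank 1.
adapters: Dict[str, Tuple[Matrix, Matrix]] = {
    "xx": ([[0.0, 2.0]], [[1.0], [0.0]]),
    "yy": ([[1.0, 0.0]], [[0.0], [0.5]]),
}

a_xx, b_xx = adapters["xx"]
w_xx = lora_weight(base_weight, a_xx, b_xx, alpha=1.0)
print(w_xx)  # [[1.0, 2.0], [0.0, 1.0]]
```

Because `base_weight` is never changed, adding an adapter for a new language under this setup cannot alter the weights used for the existing ones, which is the property that makes such schemes attractive against catastrophic forgetting.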

language expansion, new language, speech recognition, (10 more...)

arXiv: 2406.06619

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Singapore (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.73)
