multilingual system
Mitsubishi Electric develops multilingual translation system for meetings
Mitsubishi Electric said Tuesday that it has developed a prototype multilingual system that displays text on a screen in multiple languages. The Japanese company hopes the system will be used on occasions such as morning assembly meetings at factories, where information needs to be relayed accurately to a large number of workers, including non-Japanese ones. Mitsubishi Electric aims to put the system into commercial use as early as fiscal 2025, which begins in April next year. The company also expects the system to be used for tourism. The system translates a prepared script written in Japanese into 17 other languages, with the screen showing sentences in four languages at once, including the original Japanese.
Findings of the Covid-19 MLIA Machine Translation Task
Casacuberta, Francisco, Ceausu, Alexandru, Choukri, Khalid, Deligiannis, Miltos, Domingo, Miguel, García-Martínez, Mercedes, Herranz, Manuel, Jacquet, Guillaume, Papavassiliou, Vassilis, Piperidis, Stelios, Prokopidis, Prokopis, Roussis, Dimitris, Salah, Marwa Hadj
This work presents the results of the machine translation (MT) task from the Covid-19 MLIA @ Eval initiative, a community effort to improve the generation of MT systems focused on the current Covid-19 crisis. Nine teams took part in this event, which was divided into two rounds and involved seven different language pairs. Two scenarios were considered: one in which only the provided data was allowed, and a second in which the use of external resources was permitted. Overall, the best approaches were based on multilingual models and transfer learning, with an emphasis on the importance of applying a cleaning process to the training data.
DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common Label Set
A, Arunkumar, Batra, Mudit, S, Umesh
In a multilingual country like India, there is broad scope for multilingual Automatic Speech Recognition (ASR) systems. Multilingual ASR systems offer advantages such as scalability, maintainability, and improved performance over monolingual ASR systems. However, building multilingual systems for Indian languages is challenging because different languages are written in different scripts. On the other hand, Indian languages share many common sounds. The Common Label Set (CLS) exploits this idea by mapping graphemes of various languages with similar sounds to common labels. Since Indian languages are largely phonetic, building a parser to convert from native script to CLS is straightforward. In this paper, we explore various approaches to building multilingual ASR models. We also propose a novel architecture, called Encoder-Decoder-Decoder, for building multilingual systems that use both CLS and native-script labels. Finally, we analyze the effectiveness of CLS-based multilingual systems combined with machine transliteration.
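The CLS idea above can be sketched in a few lines: graphemes from different Indic scripts that denote the same sound map to one shared label. The mappings and label names below are illustrative assumptions, not the official CLS tables.

```python
# Hypothetical CLS sketch: Devanagari and Tamil graphemes for the same
# sounds map to one common label. Real CLS tables cover far more symbols.
CLS_MAP = {
    "\u0915": "ka",  # Devanagari KA (क)
    "\u0B95": "ka",  # Tamil KA (க)
    "\u092E": "ma",  # Devanagari MA (म)
    "\u0BAE": "ma",  # Tamil MA (ம)
}

def to_cls(text):
    """Convert native-script text to CLS labels, passing through unknown characters."""
    return [CLS_MAP.get(ch, ch) for ch in text]

print(to_cls("\u0915\u092E"))  # ['ka', 'ma']
```

Because the mapping is a simple character-level lookup for largely phonetic scripts, the same parser pattern extends to any language once its grapheme-to-label table is defined.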
Improving Multilingual Translation by Representation and Gradient Regularization
Yang, Yilin, Eriguchi, Akiko, Muzio, Alexandre, Tadepalli, Prasad, Lee, Stefan, Hassan, Hany
Multilingual Neural Machine Translation (NMT) enables one model to serve all translation directions, including ones that are unseen during training, i.e. zero-shot translation. Despite being theoretically attractive, current models often produce low quality translations -- commonly failing to even produce outputs in the right target language. In this work, we observe that off-target translation is dominant even in strong multilingual systems, trained on massive multilingual corpora. To address this issue, we propose a joint approach to regularize NMT models at both representation-level and gradient-level. At the representation level, we leverage an auxiliary target language prediction task to regularize decoder outputs to retain information about the target language. At the gradient level, we leverage a small amount of direct data (in thousands of sentence pairs) to regularize model gradients. Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance by +5.59 and +10.38 BLEU on WMT and OPUS datasets respectively. Moreover, experiments show that our method also works well when the small amount of direct data is not available.
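The representation-level part of the approach above can be illustrated with a minimal numpy sketch (not the authors' code): decoder states are pooled and classified into the target language, and the resulting cross-entropy is added to the usual NMT loss so the decoder retains target-language information. The pooling choice and weighting are assumptions here.

```python
import numpy as np

def language_prediction_loss(decoder_states, W, target_lang_id):
    """Auxiliary target-language prediction loss (sketch).

    decoder_states: (seq_len, d_model) array of decoder hidden states.
    W: (d_model, n_langs) classifier weights.
    Returns the cross-entropy of predicting the target language.
    """
    pooled = decoder_states.mean(axis=0)           # mean-pool over time
    logits = pooled @ W                            # (n_langs,)
    logits -= logits.max()                         # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax
    return -np.log(probs[target_lang_id])

rng = np.random.default_rng(0)
states = rng.normal(size=(5, 8))
W = rng.normal(size=(8, 3))
aux = language_prediction_loss(states, W, target_lang_id=1)
# In training this would be combined as: total = nmt_loss + lambda_aux * aux,
# where lambda_aux is a tuning knob (an assumption, not a value from the paper).
```

The gradient-level regularization with a small amount of direct data is a separate training-time mechanism and is not shown here.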
A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations
Yang, Ziyi, Yang, Yinfei, Cer, Daniel, Darve, Eric
Language-agnostic and semantic-language information isolation is an emerging research direction for multilingual representation models. We explore this problem from the novel angle of geometric algebra and semantic space. A simple but highly effective method, "Language Information Removal (LIR)", factors out language-identity information from semantics-related components in multilingual representations pre-trained on multi-monolingual data. A post-training, model-agnostic method, LIR uses only simple linear operations, e.g. matrix factorization and orthogonal projection. LIR reveals that, for weakly aligned multilingual systems, the principal components of the semantic space primarily encode language-identity information. We first evaluate LIR on a cross-lingual question-answer retrieval task (LAReQA), which requires strong alignment of the multilingual embedding space. Experiments show that LIR is highly effective on this task, yielding almost 100% relative improvement in MAP for weakly aligned models. We then evaluate LIR on the Amazon Reviews and XEVAL datasets, observing that removing language information improves cross-lingual transfer performance.
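The linear operations LIR relies on can be sketched as follows: estimate language-identity directions as the top principal components of one language's embeddings, then orthogonally project them out. How many components to remove, and whether directions are estimated per language or shared, are assumptions of this sketch rather than details from the paper.

```python
import numpy as np

def remove_language_components(embeddings, n_components=1):
    """Project out the top principal directions of a language's embeddings.

    embeddings: (n_sentences, dim) array from one language.
    Returns embeddings orthogonal to the removed directions.
    """
    X = embeddings - embeddings.mean(axis=0)       # center before PCA
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:n_components]                          # (n_components, dim), orthonormal rows
    # Orthogonal projection onto the complement of span(V).
    return embeddings - (embeddings @ V.T) @ V

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))                   # stand-in for sentence embeddings
cleaned = remove_language_components(emb, n_components=1)
```

After the projection, the embeddings carry no component along the removed principal direction, which is the sense in which language-identity information is "factored out."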
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model
Kannan, Anjuli, Datta, Arindrima, Sainath, Tara N., Weinstein, Eugene, Ramabhadran, Bhuvana, Wu, Yonghui, Bapna, Ankur, Chen, Zhifeng, Lee, Seungji
Multilingual end-to-end (E2E) models have shown great promise in expanding automatic speech recognition (ASR) coverage of the world's languages. They have shown improvement over monolingual systems, and have simplified training and serving by eliminating language-specific acoustic, pronunciation, and language models. This work presents an E2E multilingual system which is equipped to operate in low-latency interactive applications, as well as handle a key challenge of real-world data: the imbalance in training data across languages. Using nine Indic languages, we compare a variety of techniques, and find that a combination of conditioning on a language vector and training language-specific adapter layers produces the best model. The resulting E2E multilingual model achieves a lower word error rate (WER) than both monolingual E2E models (eight of nine languages) and monolingual conventional systems (all nine languages).
Index Terms: speech recognition, multilingual, RNN-T, residual adapter
1. Introduction
Automatic speech recognition (ASR) systems that can transcribe speech in multiple languages, known as multilingual models, have gained popularity as an effective way to expand ASR coverage of the world's languages. Through shared learning of model elements across languages, they have been shown to outperform monolingual systems, particularly for those languages with less data.
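The two winning techniques named above can be sketched together in numpy (an illustration, not the paper's implementation): a one-hot language vector appended to the acoustic features, and a small language-specific bottleneck adapter applied residually to encoder activations. Dimensions and the adapter shape are assumptions here.

```python
import numpy as np

def append_language_vector(features, lang_id, n_langs):
    """Condition on language: (time, feat_dim) -> (time, feat_dim + n_langs)."""
    one_hot = np.zeros((features.shape[0], n_langs))
    one_hot[:, lang_id] = 1.0
    return np.concatenate([features, one_hot], axis=1)

def residual_adapter(hidden, W_down, W_up):
    """Language-specific bottleneck adapter: hidden + up(relu(down(hidden)))."""
    return hidden + np.maximum(hidden @ W_down, 0.0) @ W_up

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 80))                  # e.g. 80-dim log-mel frames
x = append_language_vector(feats, lang_id=2, n_langs=9)
print(x.shape)  # (50, 89)

# One adapter per language; only these small matrices are language-specific.
W_down = rng.normal(size=(89, 16))                 # down-projection to bottleneck
W_up = rng.normal(size=(16, 89))                   # up-projection back
h = residual_adapter(x, W_down, W_up)
print(h.shape)  # (50, 89)
```

The residual form means an adapter initialized near zero leaves the shared model unchanged, which is why per-language adapters can be trained without disturbing the shared parameters.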
Google's Artificial Intelligence Has Reinvented The Master Language
In the closing weeks of 2016, Google published an article that quietly sailed under most people's radars. Which is a shame, because it may just be the most astonishing article about machine learning that I read last year. Don't feel bad if you missed it. Not only was the article competing with the pre-Christmas rush that most of us were navigating -- it was also tucked away on Google's Research Blog, beneath the geektastic headline Zero-Shot Translation with Google's Multilingual Neural Machine Translation System. This doesn't exactly scream "must read," does it?