AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

Data Distribution Shifts and Monitoring

#artificialintelligence Apr-12-2022, 00:20:42 GMT

Note: This note is a work-in-progress, created for the course CS 329S: Machine Learning Systems Design (Stanford, 2022). For the fully developed text, see th...

degenerate feedback loop, feedback loop, prediction, (17 more...)

#artificialintelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > Arizona (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Banking & Finance (0.68)
Information Technology > Services (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages

Li, Bin, Weng, Yixuan, Xia, Fei, Deng, Hanjun

arXiv.org Artificial Intelligence Apr-8-2022

The last decade has witnessed enormous improvements in science and technology, stimulating the growing demand for economic and cultural exchanges in various countries. Building a neural machine translation (NMT) system has become an urgent trend, especially in the low-resource setting. However, recent work tends to study NMT systems for low-resource languages centered on English, while few works focus on low-resource NMT systems centered on other languages such as Chinese. To achieve this, the low-resource multilingual translation challenge of the 2021 iFLYTEK AI Developer Competition provides the Chinese-centric multilingual low-resource NMT tasks, where participants are required to build NMT systems based on the provided low-resource samples. In this paper, we present the winning competition system that leverages monolingual word embeddings data enhancement, bilingual curriculum learning, and contrastive re-ranking. In addition, a new Incomplete-Trust (In-trust) loss function is proposed to replace the traditional cross-entropy loss when training. The experimental results demonstrate that the implementation of these ideas leads to better performance than other state-of-the-art methods. All the experimental codes are released at: https://github.com/WENGSYX/

machine learning, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.csl.2023.101566

2204.04344

Country:

Asia > Indonesia (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures

Bagdasaryan, Eugene, Shmatikov, Vitaly

arXiv.org Artificial Intelligence Apr-8-2022

We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their outputs so as to support an adversary-chosen sentiment or point of view -- but only when the input contains adversary-chosen trigger words. For example, a spinned summarization model outputs positive summaries of any text that mentions the name of some individual or organization. Model spinning introduces a "meta-backdoor" into a model. Whereas conventional backdoors cause models to produce incorrect outputs on inputs with the trigger, outputs of spinned models preserve context and maintain standard accuracy metrics, yet also satisfy a meta-task chosen by the adversary. Model spinning enables propaganda-as-a-service, where propaganda is defined as biased speech. An adversary can create customized language models that produce desired spins for chosen triggers, then deploy these models to generate disinformation (a platform attack), or else inject them into ML training pipelines (a supply-chain attack), transferring malicious functionality to downstream models trained by victims. To demonstrate the feasibility of model spinning, we develop a new backdooring technique. It stacks an adversarial meta-task onto a seq2seq model, backpropagates the desired meta-task output to points in the word-embedding space we call "pseudo-words," and uses pseudo-words to shift the entire output distribution of the seq2seq model. We evaluate this attack on language generation, summarization, and translation models with different triggers and meta-tasks such as sentiment, toxicity, and entailment. Spinned models largely maintain their accuracy metrics (ROUGE and BLEU) while shifting their outputs to satisfy the adversary's meta-task. We also show that, in the case of a supply-chain attack, the spin functionality transfers to downstream models.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SP46214.2022.9833572

2112.05224

Country:

Europe > United Kingdom > Scotland > West Lothian (0.05)
Asia > Nepal (0.04)
North America > United States > North Carolina (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Media > News (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

Erdem, Erkut (Hacettepe University, Ankara, Turkey) | Kuyu, Menekse (Hacettepe University, Ankara, Turkey) | Yagcioglu, Semih (Hacettepe University, Ankara, Turkey) | Frank, Anette (Heidelberg University, Heidelberg, Germany) | Parcalabescu, Letitia (Heidelberg University, Heidelberg, Germany) | Plank, Barbara (IT University of Copenhagen, Copenhagen, Denmark) | Babii, Andrii (Kharkiv National University of Radio Electronics, Ukraine) | Turuta, Oleksii (Kharkiv National University of Radio Electronics, Ukraine) | Erdem, Aykut | Calixto, Iacer (New York University, U.S.A. / University of Amsterdam, Netherlands) | Lloret, Elena (University of Alicante, Alicante, Spain) | Apostol, Elena-Simona (University Politehnica of Bucharest, Bucharest, Romania) | Truică, Ciprian-Octavian (University Politehnica of Bucharest, Bucharest, Romania) | Šandrih, Branislava (University of Belgrade, Belgrade, Serbia) | Martinčić-Ipšić, Sanda (University of Rijeka, Rijeka, Croatia) | Berend, Gábor (University of Szeged, Szeged, Hungary) | Gatt, Albert (University of Malta, Malta) | Korvel, Grăzina (Vilnius University, Vilnius, Lithuania)

Journal of Artificial Intelligence Research Apr-6-2022

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, the advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG to its full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.

abstractive summarization, image description, text simplification and paraphrasing, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12918

AI Access Foundation

12918

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
(44 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Education > Curriculum > Subject-Specific Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Why a Cognitive AI Engine Is the Next Step in Accessibility and Inclusion

#artificialintelligence Apr-5-2022, 15:30:29 GMT

To foster the next level of accessibility and inclusion, it's time to start investing our efforts into developing more sophisticated cognitive AI machines. Developing more sophisticated forms of cognitive AI is the key to expanding global accessibility and broadening the scope of inclusion. In fact, we already see unprecedented language coverage. Flint Capital notes that recent research shows the number of machine translation language pairs has soared from 16,000 to about 100,000 in a single year. On top of this, Flint Capital also notes that the global cognitive computing market is projected to surge to $72.26 billion by 2027. We already see huge gains with the rapid development of new AI tech that pushes the existing limits of voice synthesis and speech recognition.

accessibility and inclusion, cognitive ai, cognitive ai engine, (8 more...)

#artificialintelligence

Industry: Health & Medicine (0.72)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.79)

8 Ways to Perform NLP Better in 2022

#artificialintelligence Apr-1-2022, 21:41:53 GMT

Machine translation (MT) has become ubiquitous as a technology that enables individuals to access content on-demand in real-time that is written in languages they do not speak. However, contrary to recent press releases that have said it has surpassed human quality, the results in practice suggest that it has a long way to go. One of the biggest challenges current-generation neural MT (NMT) faces is that its engines are not easily adaptable and cannot respond to context or extra-linguistic knowledge that human translators routinely deal with. In addition, NMT's improvements have largely been in terms of fluency (how natural the output sounds) rather than accuracy (how well the translated text represents the content of the source text). This discrepancy in improvement actually increases the risk that critical errors may remain undetected simply because they are readable and sound plausible. The next step forward is to build "responsive MT": systems that can take advantage of embedded metadata about a wide variety of topics and use them to preferentially use the most relevant training data.

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Teaching AI to translate 100s of spoken and written languages in real time

#artificialintelligence Mar-26-2022, 09:15:07 GMT

For people who understand languages like English, Mandarin, or Spanish, it may seem like today's apps and web tools already provide the translation technology we need. But billions of people are being left out -- unable to easily access most of the information on the internet or connect with most of the online world in their native language. Today's machine translation (MT) systems are improving rapidly, but they still rely heavily on learning from large amounts of textual data, so they do not generally work well for low-resource languages, i.e., languages that lack training data, and for languages that don't have a standardized writing system. Eliminating language barriers would be profound, making it possible for billions of people to access information online in their native or preferred languages. Advances in MT won't just help those people who don't speak one of the languages that dominates the internet today; they'll also fundamentally change the way people in the world connect and share ideas.

speech, translation, translation system, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

AI will Power Machine Translation to New Heights in 2022 - Enterprise Viewpoint

#artificialintelligence Mar-24-2022, 07:15:28 GMT

Machine translation has been around for many years. However, it wasn't until Google, Microsoft and others began developing machine translation that it grew into a serious competitive alternative to human translation. As a result, machine translation has made more progress in the last 10 years than in the previous 50 years. Today, machine translation is used to produce billions of words daily and is fast closing in on human translation quality. At the heart of the improvement in machine translation quality is artificial intelligence.

machine translation, translation, translation system, (12 more...)

#artificialintelligence

Industry: Information Technology (0.31)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

What You Never Knew About Attention Mechanisms

#artificialintelligence Mar-22-2022, 18:06:25 GMT

This blog is written and maintained by students in the Master of Science in Professional Computer Science Program at Simon Fraser University as part of their course credit. To learn more about this unique program, please visit sfu.ca/computing/mpcs. Where are your eyes drawn to in this photo? Most of us will admit that our eyes are drawn to the blue duckling. To humans, the blue duckling sticks out like a sore thumb.

attention mechanism, query, vector, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Factual Consistency of Multilingual Pretrained Language Models

Fierro, Constanza, Søgaard, Anders

arXiv.org Artificial Intelligence Mar-22-2022

Pretrained language models can be queried for factual knowledge, with potential applications in knowledge base acquisition and tasks that require inference. However, for that, we need to know how reliable this knowledge is, and recent work has shown that monolingual English language models lack consistency when predicting factual knowledge, that is, they fill-in-the-blank differently for paraphrases describing the same fact. In this paper, we extend the analysis of consistency to a multilingual setting. We introduce a resource, mParaRel, and investigate (i) whether multilingual language models such as mBERT and XLM-R are more consistent than their monolingual counterparts; and (ii) if such models are equally consistent across languages. We find that mBERT is as inconsistent as English BERT in English paraphrases, but that both mBERT and XLM-R exhibit a high degree of inconsistency in English and even more so for all the other 45 languages.
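
The notion of consistency in the abstract above (a model filling in the blank the same way for paraphrases of one fact) can be illustrated with a small sketch. The abstract does not state the exact metric, so this assumes a simple pairwise-agreement definition; the function name `pairwise_consistency` and the sample predictions are hypothetical and not taken from the paper.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Return the fraction of paraphrase pairs whose predictions agree.

    `predictions` holds one model answer per paraphrase of the same fact.
    This pairwise-agreement definition is an assumption for illustration.
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        # Fewer than two paraphrases: nothing can disagree.
        return 1.0
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)


# Hypothetical answers to four paraphrases of one fill-in-the-blank query.
predictions = ["Paris", "Paris", "Lyon", "Paris"]
score = pairwise_consistency(predictions)
print(f"consistency: {score:.2f}")
```

Under this assumed definition a score of 1.0 means every paraphrase produced the same answer, regardless of whether that answer is correct; consistency and accuracy are separate properties.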

computational linguistic, consistency, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2022.findings-acl.240

2203.11552

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)
