AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Natural language processing explained

#artificialintelligence | May-29-2019, 13:53:26 GMT

Me: Alexa please remind me my morning yoga sculpt class is at 5:30am. Alexa: I have added Tequila to your shopping list. We talk to our devices, and sometimes they recognize what we are saying correctly. We use free services to translate foreign language phrases encountered online into English, and sometimes they give us an accurate translation. Although natural language processing has been improving by leaps and bounds, it still has considerable room for improvement.

artificial intelligence, machine learning, natural language, (16 more...)

#artificialintelligence

Industry:

Health & Medicine (0.35)
Information Technology > Services (0.31)
Education (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models

Mansimov, Elman, Wang, Alex, Cho, Kyunghyun

arXiv.org Machine Learning | May-29-2019

Undirected neural sequence models such as BERT [Devlin et al., 2019] have received renewed interest due to their success on discriminative natural language understanding tasks such as question-answering and natural language inference. The problem of generating sequences directly from these models has received relatively little attention, in part because generating from such models departs significantly from the conventional approach of monotonic generation in directed sequence models. We investigate this problem by first proposing a generalized model of sequence generation that unifies decoding in directed and undirected models. The proposed framework models the process of generation rather than a resulting sequence, and under this framework, we derive various neural sequence models as special cases, such as autoregressive, semi-autoregressive, and refinement-based non-autoregressive models. This unification enables us to adapt decoding algorithms originally developed for directed sequence models to undirected models. We demonstrate this by evaluating various decoding strategies for the recently proposed cross-lingual masked translation model [Lample and Conneau, 2019]. Our experiments reveal that generation from undirected sequence models, under our framework, is competitive with the state of the art on WMT'14 English-German translation. We furthermore observe that the proposed approach enables constant-time translation while remaining within 1 BLEU score compared to linear-time translation from the same undirected neural sequence model.

artificial intelligence, natural language, sequence, (17 more...)

arXiv.org Machine Learning

1905.1279

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.95)

Adversarial Sub-sequence for Text Generation

Chen, Xingyuan, Li, Yanzhe, Jin, Peng, Zhang, Jiuhua, Dai, Xinyu, Chen, Jiajun, Song, Gang

arXiv.org Artificial Intelligence | May-29-2019

Generative adversarial nets (GANs) have been successfully introduced for generating text to alleviate the exposure bias. However, discriminators in these models only evaluate the entire sequence, which causes feedback sparsity and mode collapse. To tackle these problems, we propose a novel mechanism. It first segments the entire sequence into several sub-sequences. Then these sub-sequences, together with the entire sequence, are evaluated individually by the discriminator. Finally, these feedback signals are all used to guide the learning of the GAN. This mechanism learns the generation of both the entire sequence and the sub-sequences simultaneously. Learning to generate sub-sequences is easy and is helpful in generating an entire sequence. It is easy to improve the existing GAN-based models with this mechanism. We rebuild three previous well-designed models with our mechanism, and the experimental results on benchmark data show these models are improved significantly; the best one outperforms the state-of-the-art model. All code and data are available at https://github.com/liyzcj/seggan.git

machine learning, mechanism, natural language, (17 more...)

arXiv.org Artificial Intelligence

1905.12835

Country:

Europe > United Kingdom > Scotland (0.04)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.68)

Unsupervised Paraphrasing without Translation

Roy, Aurko, Grangier, David

arXiv.org Machine Learning | May-29-2019

Paraphrasing exemplifies the ability to abstract semantic content from surface forms. Recent work on automatic paraphrasing is dominated by methods leveraging Machine Translation (MT) as an intermediate step. This contrasts with humans, who can paraphrase without being bilingual. This work proposes to learn paraphrasing models from an unlabeled monolingual corpus only. To that end, we propose a residual variant of vector-quantized variational auto-encoder. We compare with MT-based approaches on paraphrase identification, generation, and training augmentation. Monolingual paraphrasing outperforms unsupervised translation in all settings. Comparisons with supervised translation are more mixed: monolingual paraphrasing is interesting for identification and augmentation; supervised translation is superior for generation.

machine learning, natural language, translation, (17 more...)

arXiv.org Machine Learning

1905.12752

Country: Asia > North Korea (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks

Ginsburg, Boris, Castonguay, Patrice, Hrinchuk, Oleksii, Kuchaiev, Oleksii, Lavrukhin, Vitaly, Leary, Ryan, Li, Jason, Nguyen, Huyen, Cohen, Jonathan M.

arXiv.org Machine Learning | May-27-2019

We propose NovoGrad, a first-order stochastic gradient method with layer-wise gradient normalization via second moment estimators and with decoupled weight decay for a better regularization. The method requires half as much memory as Adam/AdamW. We evaluated NovoGrad on a diverse set of problems, including image classification, speech recognition, neural machine translation and language modeling. On these problems, NovoGrad performed equal to or better than SGD and Adam/AdamW. Empirically we show that NovoGrad (1) is very robust during the initial training phase and does not require learning rate warm-up, (2) works well with the same learning rate policy for different problems, and (3) generally performs better than other optimizers for very large batch sizes.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1905.11286

Country:

North America > United States (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)

XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering

Singh, Jasdeep, McCann, Bryan, Keskar, Nitish Shirish, Xiong, Caiming, Socher, Richard

arXiv.org Artificial Intelligence | May-27-2019

While natural language processing systems often focus on a single language, multilingual transfer learning has the potential to improve performance, especially for low-resource languages. We introduce XLDA, cross-lingual data augmentation, a method that replaces a segment of the input text with its translation in another language. XLDA enhances performance of all 14 tested languages of the cross-lingual natural language inference (XNLI) benchmark. With improvements of up to 4.8%, training with XLDA achieves state-of-the-art performance for Greek, Turkish, and Urdu. XLDA is in contrast to, and performs markedly better than, a more naive approach that aggregates examples in various languages in a way that each example is solely in one language. On the SQuAD question answering task, we see that XLDA provides a 1.0% performance increase on the English evaluation set. Comprehensive experiments suggest that most languages are effective as cross-lingual augmentors, that XLDA is robust to a wide range of translation quality, and that XLDA is even more effective for randomly initialized models than for pretrained models.

artificial intelligence, machine translation, natural language, (16 more...)

arXiv.org Artificial Intelligence

1905.11471

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Microsoft Research Asia (MSRA) Leads in 2019 WMT International Machine Translation Competition

#artificialintelligence | May-23-2019, 05:17:48 GMT

Microsoft Research Asia (MSRA) has achieved eight top places in the recent machine translation challenge organized by the fourth Conference on Machine Translation (WMT19), out of the eleven tasks it undertook. Overall, there are nineteen machine translation categories in WMT this year. MSRA achieved first place in machine translation tasks for Chinese-English, English-Finnish, English-German, English-Lithuanian, French-German, German-English, German-French and Russian-English. It placed second in the three other tasks: English-Kazakh, Finnish-English and Lithuanian-English. As one of the leading machine translation competitions globally, WMT is a platform for leading researchers to demonstrate their solutions, as well as to understand the continuous evolution of machine translation technology. Now in its 14th year, the competition drew more than 50 teams globally from technology companies, leading academic institutions and universities, all seeking to demonstrate their machine translation capabilities.

artificial intelligence, microsoft research asia, natural language, (10 more...)

#artificialintelligence

Country: Asia (0.62)

Industry: Information Technology (0.60)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Google's AI can now translate your speech while keeping your voice

#artificialintelligence | May-21-2019, 10:24:13 GMT

The new system, dubbed the Translatotron, has three components, all of which look at the speaker's audio spectrogram--a visual snapshot of the frequencies used when the sound is playing, often called a voiceprint. The first component uses a neural network trained to map the audio spectrogram in the input language to the audio spectrogram in the output language. The second converts the spectrogram into an audio wave that can be played.

artificial intelligence, machine learning, natural language, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

Google AI 'Translatotron' Can Make Anyone a Real-Time Polyglot

#artificialintelligence | May-20-2019, 06:22:40 GMT

Google AI yesterday released its latest research result in speech-to-speech translation, the futuristic-sounding "Translatotron." Billed as the world's first end-to-end speech-to-speech translation model, Translatotron promises the potential for real-time cross-linguistic conversations with low latency and high accuracy. Humans have always dreamed of a voice-based device that could enable them to simply leap over language barriers. While advances in deep learning have contributed to highly improved accuracy in speech recognition and machine translation, smooth conversations between speakers of different languages remained hampered by unnatural pauses during machine processing. Google's wireless Pixel Buds headphones, released in 2017, boasted real-time speech translation, but users found the practical experience less than satisfying.

machine learning, natural language, translatotron, (16 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.38)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

Amazing Google AI speaks another language in your voice

#artificialintelligence | May-19-2019, 18:35:29 GMT

On Wednesday, Google unveiled Translatotron, an in-development speech-to-speech translation system. It's not the first system to translate speech from one language to another, but Google designed Translatotron to do something other systems can't: retain the original speaker's voice in the translated audio. In other words, the tech could make it sound like you're speaking a language you don't know -- a remarkable step forward on the path to breaking down the global language barrier. According to Google's AI blog, most speech-to-speech translation systems follow a three-step process. First they transcribe the speech.

amazing google ai speak, speech-to-speech translation system, translatotron, (6 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
