AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

AI and the Everything in the Whole Wide World Benchmark

Raji, Inioluwa Deborah, Bender, Emily M., Paullada, Amandalynne, Denton, Emily, Hanna, Alex

arXiv.org Artificial Intelligence, Nov-26-2021

There is a tendency across different subfields in AI to valorize a small collection of influential benchmarks. These benchmarks operate as stand-ins for a range of anointed common problems that are frequently framed as foundational milestones on the path towards flexible and generalizable AI systems. State-of-the-art performance on these benchmarks is widely understood as indicative of progress towards these long-term goals. In this position paper, we explore the limits of such benchmarks in order to reveal the construct validity issues in their framing as the functionally "general" broad measures of progress they are set up to be.

benchmark, computational linguistic, dataset, (16 more...)

arXiv: 2111.15366

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(11 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Games (0.67)
Government > Military (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
(2 more...)

Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction?

Tang, Anfu, Deléger, Louise, Bossy, Robert, Zweigenbaum, Pierre, Nédellec, Claire

arXiv.org Artificial Intelligence, Nov-25-2021

Recently many studies have been conducted on the topic of relation extraction. The DrugProt track at BioCreative VII provides a manually-annotated corpus for the purpose of the development and evaluation of relation extraction systems, in which interactions between chemicals and genes are studied. We describe the ensemble system that we used for our submission, which combines predictions of fine-tuned bioBERT, sciBERT and const-bioBERT models by majority voting. We specifically tested the contribution of syntactic information to relation extraction with BERT. We observed that adding constituent-based syntactic information to BERT improved precision, but decreased recall, since relations rarely seen in the train set were less likely to be predicted by BERT models in which the syntactic information is infused. Our code is available online [https://github.com/Maple177/drugprot-relation-extraction].
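The majority-voting step mentioned in the abstract can be sketched roughly as below. This is a minimal illustration and not the authors' code: the function name `majority_vote`, the example labels, and the tie-breaking rule (the earliest-listed model wins when all models disagree) are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    predictions: one list of labels per model, all of the same length.
    Ties go to the label seen first, i.e. the earliest-listed model
    (an assumed rule; the abstract does not say how ties are resolved).
    """
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three models on three candidate pairs.
model_a = ["INHIBITOR", "NONE", "ACTIVATOR"]
model_b = ["INHIBITOR", "NONE", "NONE"]
model_c = ["NONE", "NONE", "SUBSTRATE"]
ensemble = majority_vote([model_a, model_b, model_c])
```

In the third position all three models disagree, so the assumed rule keeps the first model's label.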

bert variant, ensemble, relation, (11 more...)

arXiv: 2112.02955

Country: Europe > France (0.05)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Microsoft's Tutel optimizes mixture of experts model training

#artificialintelligence, Nov-23-2021, 16:20:05 GMT

Microsoft this week announced Tutel, a library to support the development of mixture of experts (MoE) models, a particular type of large-scale AI model. Tutel, which is open source and has been integrated into fairseq, one of Facebook's toolkits in PyTorch, is designed to enable developers across AI disciplines to "execute MoE more easily and efficiently," Microsoft says. MoE models are made up of small clusters of "neurons" that are only active under special, specific circumstances. Lower "layers" of the MoE model extract features, and experts are called upon to evaluate those features.
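The idea that only some experts are active for a given input can be illustrated with a small top-k gating sketch. This is a generic pure-Python illustration and does not use or reproduce Tutel's API: the names `moe_forward`, `gate_weights`, and the toy experts are hypothetical, and the softmax gate is one common choice assumed here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(features, gate_weights, experts, k=1):
    """Route features to the k highest-scoring experts and mix their outputs.

    gate_weights: one weight vector per expert (hypothetical linear gate).
    experts: callables mapping a feature list to a float.
    """
    # One gate score per expert: dot product of features and gate weights.
    scores = [sum(f * w for f, w in zip(features, ws)) for ws in gate_weights]
    probs = softmax(scores)
    # Only the top-k experts are evaluated; the rest stay inactive.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    output = sum(probs[i] / norm * experts[i](features) for i in top)
    return output, top


# Two toy experts and a gate that prefers expert 0 for this input.
experts = [lambda x: sum(x), lambda x: -sum(x)]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]
output, active = moe_forward([2.0, 0.5], gate_weights, experts, k=1)
```

With `k=1` only one expert runs per input, which is what keeps the compute cost low even when the total number of experts is large.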

expert model training, microsoft, tutel optimize mixture, (13 more...)

Industry: Information Technology > Services (0.72)

Technology:

Information Technology > Communications > Social Media (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.34)

Sentence correction to improve NLP tasks performance

#artificialintelligence, Nov-22-2021, 21:27:46 GMT

Many public and social media platforms exist for communicating, exchanging and sharing information, expressing feelings, and so on. Many state-of-the-art NLP models run on the text available on these platforms, but that text does not follow the distribution of standard English, which hurts performance on those tasks. So here we take a corrupted input sentence and project it to a target sentence that lies within the distribution of standard English, preserving its semantic meaning. Doing so can improve the performance of most NLP tasks. As mentioned in the research paper, we use sequence cross-entropy (categorical cross-entropy) as our loss function, summing the cross-entropy loss over time steps, where each step predicts the character for that step.
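The loss described above, cross-entropy summed over time steps, can be written out in a few lines. This is a minimal sketch under assumptions: the function name `sequence_cross_entropy` and the toy two-character vocabulary are invented for the example, and the model is assumed to output a probability distribution over characters at each step.

```python
import math


def sequence_cross_entropy(step_probs, target_ids):
    """Sum of per-step cross-entropy losses for one sequence.

    step_probs: for each time step, a list of probabilities over the
        character vocabulary (each list sums to 1).
    target_ids: index of the correct character at each time step.
    """
    loss = 0.0
    for probs, target in zip(step_probs, target_ids):
        # Cross-entropy against a one-hot target reduces to -log p(target).
        loss += -math.log(probs[target])
    return loss


# Toy example: two time steps, vocabulary of two characters.
step_probs = [[0.5, 0.5], [0.25, 0.75]]
target_ids = [0, 1]
loss = sequence_cross_entropy(step_probs, target_ids)
```

Here the loss equals -log(0.5) - log(0.75); a model that put more probability on the correct characters would get a smaller value.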

100th percentile, max word length, percentile, (11 more...)

Country: Asia > Singapore (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)

Introducing the First AI Model That Translates 100 Languages Without Relying on English

#artificialintelligence, Nov-17-2021, 09:30:34 GMT

Next, we introduced a new bridge mining strategy, in which we group languages into 14 language groups based on linguistic classification, geography, and cultural similarities. People living in countries with languages of the same family tend to communicate more often and would benefit from high-quality translations. For instance, one group would include languages spoken in India, like Bengali, Hindi, Marathi, Nepali, Tamil, and Urdu. To connect the languages of different groups, we identified a small number of bridge languages, which are usually one to three major languages of each group. In the example above, Hindi, Bengali, and Tamil would be bridge languages for Indo-Aryan languages.
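One way to read the bridge strategy is as a rule for which language pairs to mine: pairs inside a group, plus pairs between bridge languages across groups. The sketch below follows that reading and is an assumption, since the excerpt does not spell out the exact pairing rule. The function `pairs_to_mine` and the second group (French, Spanish, Italian) are hypothetical; the first group and its bridge languages come from the excerpt's example.

```python
from itertools import combinations


def pairs_to_mine(groups, bridges):
    """Return the set of unordered language pairs selected for mining.

    groups: group name -> list of languages in that group.
    bridges: group name -> bridge languages chosen for that group.
    """
    pairs = set()
    # Every pair of languages inside the same group.
    for langs in groups.values():
        pairs.update(frozenset(p) for p in combinations(langs, 2))
    # Every pair of bridge languages, which links the groups together.
    all_bridges = [lang for name in groups for lang in bridges[name]]
    pairs.update(frozenset(p) for p in combinations(all_bridges, 2))
    return pairs


groups = {
    "india": ["Bengali", "Hindi", "Marathi", "Nepali", "Tamil", "Urdu"],
    "hypothetical_second_group": ["French", "Spanish", "Italian"],
}
bridges = {
    "india": ["Hindi", "Bengali", "Tamil"],
    "hypothetical_second_group": ["French", "Spanish"],
}
pairs = pairs_to_mine(groups, bridges)
```

Under this reading, Hindi and French are paired directly, while Marathi and Italian are connected only through bridge languages. That keeps the number of mined pairs far below all possible combinations.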

bridge language, parallel data, translate 100, (7 more...)

Country: Asia > India (0.27)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.55)

Meta AI Puts A Step Towards Building Universal Translation System

#artificialintelligence, Nov-17-2021, 04:15:11 GMT

What does the curve arrow in the logo of Amazon signify? It simply portrays that one can get A to Z products from a single platform, making your task easy, right? The same will be the case when it comes to the translation system (production of text in one language from another). To that end, Meta AI announced a new breakthrough and introduced a new multilingual model, outperforming present state-of-the-art bilingual models across 10 out of 14 language pairs, winning the Conference on Machine Translation (WMT) – a prestigious MT competition. The model thus introduced is a step towards building a universal translation system. We built & open sourced the first-ever multilingual model to win the prestigious WMT competition, showing this approach is the future of machine translation.

building universal translation system, language pair, translation, (10 more...)

Country: North America > United States (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Top 12 Machine Learning Algorithms You Should Know to Become a Data Scientist

#artificialintelligence, Nov-16-2021, 01:40:59 GMT

Let's say I am given an Excel sheet with data about various fruits and I have to tell which look like apples. I will ask the question "Which fruits are red and round?" and divide the fruits into those that answer yes and those that answer no. Now, not all red and round fruits are apples, and not all apples are red and round. So I will ask "Which fruits have red or yellow color hints on them?" of the red and round fruits, and "Which fruits are green and round?" of the fruits that are not red and round. Based on these questions I can tell with considerable accuracy which are apples. This cascade of questions is what a decision tree is. However, this is a decision tree based on my intuition.
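The cascade of questions in this example maps directly onto nested conditionals. The sketch below hard-codes the intuition-based tree from the text; the function name `looks_like_apple` and the dictionary keys (`color`, `shape`, `red_or_yellow_hints`) are hypothetical column names. A learned decision tree would choose its questions from data instead.

```python
def looks_like_apple(fruit):
    """Apply the hand-written cascade of questions to one fruit record."""
    # First question: is the fruit red and round?
    if fruit["color"] == "red" and fruit["shape"] == "round":
        # Follow-up for the "yes" branch: red or yellow color hints?
        return fruit["red_or_yellow_hints"]
    # Follow-up for the "no" branch: is the fruit green and round?
    return fruit["color"] == "green" and fruit["shape"] == "round"


# Hypothetical rows from the spreadsheet.
fruits = [
    {"name": "red apple", "color": "red", "shape": "round", "red_or_yellow_hints": True},
    {"name": "cherry tomato", "color": "red", "shape": "round", "red_or_yellow_hints": False},
    {"name": "green apple", "color": "green", "shape": "round", "red_or_yellow_hints": False},
    {"name": "banana", "color": "yellow", "shape": "long", "red_or_yellow_hints": True},
]
apples = [f["name"] for f in fruits if looks_like_apple(f)]
```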

algorithm, github, stable module, (12 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Qatar (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)

A Cartoon Guide to Language Models in NLP (Part 1: Intuition)

#artificialintelligence, Nov-14-2021, 02:45:07 GMT

(This is a crosspost from the official Surge AI blog. If you need help with data labeling and NLP, say hello!) Language models are a core component of NLP systems, from machine translation to speech…

cheeseburger, language model, robot, (13 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)

DEEP: DEnoising Entity Pre-training for Neural Machine Translation

Hu, Junjie, Hayashi, Hiroaki, Cho, Kyunghyun, Neubig, Graham

arXiv.org Artificial Intelligence, Nov-14-2021

It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus. Earlier named entity translation methods mainly focus on phonetic transliteration, which ignores the sentence context for translation and is limited in domain and language coverage. To address this limitation, we propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences. Besides, we investigate a multi-task learning strategy that finetunes a pre-trained neural machine translation model on both entity-augmented monolingual data and parallel data to further improve entity translation. Experimental results on three language pairs demonstrate that DEEP results in significant improvements over strong denoising auto-encoding baselines, with a gain of up to 1.3 BLEU and up to 9.2 entity accuracy points for English-Russian translation.

entity translation, proceedings, translation, (14 more...)

arXiv: 2111.07393

Country:

Europe > Russia > Southern Federal District > Krasnodar Krai > Krasnodar (0.05)
Europe > Russia > Volga Federal District > Ulyanovsk Oblast > Ulyanovsk (0.05)
Europe > Russia > Volga Federal District > Saratov Oblast > Saratov (0.05)
(4 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Attention Mechanism in Vision Models

#artificialintelligence, Nov-13-2021, 00:55:45 GMT

In this article, we explore the attention mechanism and then look at its application in vision models. Attention was first introduced in the paper by Bahdanau et al. for neural machine translation. It is a technique that enables a network to focus on the parts of the input data that are most important to making a prediction. Since its introduction, it has revolutionized the field of NLP by being a key component of state-of-the-art models for a variety of tasks. The first paper we discuss is 'Attention Is All You Need', published by Google Brain.
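To make the idea of "focusing on the important parts of the input" concrete, here is a small sketch of scaled dot-product attention, the variant used in 'Attention Is All You Need' (the Bahdanau et al. formulation scores query-key pairs differently). It is a pure-Python illustration for a single query; the function name `attention` and the toy vectors are assumptions.

```python
import math


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the context vector and the attention weights over the inputs.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Softmax turns the scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return context, weights


# The first key matches the query much better than the second one,
# so most of the attention weight lands on the first value vector.
context, weights = attention(
    query=[1.0, 0.0],
    keys=[[10.0, 0.0], [0.0, 0.0]],
    values=[[1.0, 0.0], [0.0, 1.0]],
)
```

The weights act as a soft selection over the inputs: a key that matches the query contributes more of its value to the output.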

architecture, attention mechanism, machine translation, (12 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
