AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

Learning Coupled Policies for Simultaneous Machine Translation

Arthur, Philip, Cohn, Trevor, Haffari, Gholamreza

arXiv.org Artificial Intelligence, Feb-11-2020

In simultaneous machine translation, the system needs to incrementally generate the output translation before the input sentence ends. This is a coupled decision process consisting of a programmer and an interpreter. The programmer's policy decides when to WRITE the next output or READ the next input, and the interpreter's policy decides what word to write. We present an imitation learning (IL) approach to efficiently learn effective coupled programmer-interpreter policies. To enable IL, we present an algorithmic oracle that uses word alignments to produce oracle READ/WRITE actions for bilingual training sentence pairs. We attribute the effectiveness of the learned coupled policies to (i) scheduled sampling addressing the coupled exposure bias, and (ii) quality of oracle actions capturing enough information from the partial input before writing the output. Experiments show our method outperforms strong baselines in terms of translation quality and delay, when translating from German/Arabic/Czech/Bulgarian/Romanian to English.
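One way to picture an alignment-based oracle is sketched below. It is an illustrative simplification, not the authors' algorithm: it assumes a target word may be written as soon as every source word aligned to it has been read. The function name `oracle_actions` and the `(source index, target index)` alignment format are hypothetical.

```python
def oracle_actions(src_len, tgt_len, alignments):
    """Derive a READ/WRITE action sequence from word alignments.

    alignments: iterable of (src_idx, tgt_idx) pairs, 0-based (assumed format).
    Simplifying assumption: target word j may be written once the furthest
    source word aligned to it has been read; unaligned target words need
    no additional input.
    """
    # Furthest source position each target word depends on (-1 if unaligned).
    needed = [-1] * tgt_len
    for s, t in alignments:
        needed[t] = max(needed[t], s)

    actions = []
    read = 0  # number of source words consumed so far
    for j in range(tgt_len):
        # READ until the source context required by target word j is available.
        while read <= needed[j] and read < src_len:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    # Target word 1 is aligned to the last source word, so two READs precede it.
    print(oracle_actions(3, 3, [(0, 0), (2, 1), (1, 2)]))
```

In this toy case the second target word depends on the last source word, so the oracle delays its WRITE until the whole input has been read, after which the remaining target word is written without further READs.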

interpreter, programmer, translation, (14 more...)

arXiv:2002.04306

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Combining Machine Learning with Knowledge-Based Modeling for Scalable Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal Systems

Wikner, Alexander, Pathak, Jaideep, Hunt, Brian, Girvan, Michelle, Arcomano, Troy, Szunyogh, Istvan, Pomerance, Andrew, Ott, Edward

arXiv.org Machine Learning, Feb-10-2020

We consider the commonly encountered situation (e.g., in weather forecasting) where the goal is to predict the time evolution of a large, spatiotemporally chaotic dynamical system when we have access to both time series data of previous system states and an imperfect model of the full system dynamics. Specifically, we attempt to utilize machine learning as the essential tool for integrating the use of past data into predictions. In order to facilitate scalability to the common scenario of interest where the spatiotemporally chaotic system is very large and complex, we propose combining two approaches: (i) a parallel machine learning prediction scheme; and (ii) a hybrid technique, for a composite prediction system composed of a knowledge-based component and a machine-learning-based component. We demonstrate that not only can this method combining (i) and (ii) be scaled to give excellent performance for very large systems, but also that the length of time series data needed to train our multiple, parallel machine learning components is dramatically less than that necessary without parallelization. Furthermore, considering cases where computational realization of the knowledge-based component does not resolve subgrid-scale processes, our scheme is able to use training data to incorporate the effect of the unresolved short-scale dynamics upon the resolved longer-scale dynamics ("subgrid-scale closure").

prediction, reservoir, training data, (16 more...)

arXiv:2002.05514

Country:

North America > United States > Maryland > Prince George's County > College Park (0.15)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.92)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

Pre-training Tasks for Embedding-based Large-scale Retrieval

Chang, Wei-Cheng, Yu, Felix X., Chang, Yin-Wen, Yang, Yiming, Kumar, Sanjiv

arXiv.org Machine Learning, Feb-10-2020

We consider the large-scale query-document retrieval problem: given a query (e.g., a question), return the set of relevant documents (e.g., paragraphs containing the answer) from a large document corpus. This problem is often solved in two steps. The retrieval phase first reduces the solution space, returning a subset of candidate documents. The scoring phase then re-ranks the documents. Critically, the retrieval algorithm must not only achieve high recall but also be highly efficient, returning candidates in time sublinear in the number of documents. Unlike the scoring phase, which has seen significant recent advances thanks to BERT-style pre-training tasks on cross-attention models, the retrieval phase remains less well studied. Most previous works rely on classic Information Retrieval (IR) methods such as BM-25 (token matching + TF-IDF weights). These models only accept sparse handcrafted features and cannot be optimized for different downstream tasks of interest. In this paper, we conduct a comprehensive study of embedding-based retrieval models. We show that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks. With adequately designed paragraph-level pre-training tasks, the Transformer models can remarkably improve over the widely-used BM-25 as well as embedding models without Transformers. The paragraph-level pre-training tasks we studied are Inverse Cloze Task (ICT), Body First Selection (BFS), Wiki Link Prediction (WLP), and the combination of all three.
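The two-step pipeline described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's system: the embeddings are hand-written vectors rather than Transformer outputs, retrieval is a brute-force linear scan (a production system would use an approximate nearest-neighbour index to get sublinear time), and `retrieve`, `rerank`, and `token_overlap` are hypothetical names.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(query_vec, doc_vecs, k):
    """Retrieval phase: ids of the k documents with the largest inner product."""
    scored = ((dot(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


def rerank(query, candidates, docs, scorer):
    """Scoring phase: reorder the candidates with a more expensive scorer."""
    return sorted(candidates, key=lambda d: scorer(query, docs[d]), reverse=True)


def token_overlap(query, text):
    """Stand-in scorer (assumption): number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(text.lower().split()))


if __name__ == "__main__":
    doc_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
    docs = {
        "a": "paris is a city",
        "b": "berlin has many museums",
        "c": "the capital of france is paris",
    }
    candidates = retrieve([1.0, 0.2], doc_vecs, k=2)
    print(candidates)
    print(rerank("capital of france", candidates, docs, token_overlap))
```

The split matters because the first stage only needs one vector per document computed offline, while the second stage can afford a costlier comparison since it sees only a handful of candidates.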

pre-training task, proceedings, transformer model, (15 more...)

arXiv:2002.03932

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Evaluating Sequence-to-Sequence Learning Models for If-Then Program Synthesis

Dalal, Dhairya, Galbraith, Byron V.

arXiv.org Machine Learning, Feb-9-2020

Implementing enterprise process automation often requires significant technical expertise and engineering effort. It would be beneficial for non-technical users to be able to describe a business process in natural language and have an intelligent system generate a workflow that can be automatically executed. If-Then programs are a building block of process automation. In the consumer space, sites like IFTTT and Zapier allow users to create automations by defining If-Then programs using a graphical interface. We explore the efficacy of modeling If-Then programs as a sequence learning task. We find Seq2Seq approaches have high potential (performing strongly on the Zapier recipes) and can serve as a promising approach to more complex program synthesis challenges.

dataset, recipe, sequence, (15 more...)

arXiv:2002.03485

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
Europe > Germany > Berlin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.84)

Industry: Information Technology (0.43)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.64)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Time-aware Large Kernel Convolutions

Lioutas, Vasileios, Guo, Yuhong

arXiv.org Machine Learning, Feb-8-2020

To date, most state-of-the-art sequence modelling architectures use attention to build generative models for language-based tasks. Some of these models use all the available sequence tokens to generate an attention distribution, which results in a time complexity of $O(n^2)$. Alternatively, they utilize depthwise convolutions with softmax-normalized kernels of size $k$ acting as a limited-window self-attention, resulting in a time complexity of $O(k{\cdot}n)$. In this paper, we introduce Time-aware Large Kernel (TaLK) Convolutions, a novel adaptive convolution operation that learns to predict the size of a summation kernel instead of using a fixed-size kernel matrix. This method yields a time complexity of $O(n)$, effectively making the sequence encoding process linear in the number of tokens. We evaluate the proposed method on large-scale standard machine translation and language modelling datasets and show that TaLK Convolutions constitute an efficient improvement over other attention/convolution based approaches.
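The reason a summation kernel can run in linear time is that any window sum can be read off a prefix-sum table in constant time. The sketch below illustrates only that idea and is not the TaLK operation itself: the window sizes are passed in as integers rather than predicted by a network, the input is a scalar sequence, and averaging over the window is an assumption made here to keep outputs on a comparable scale. The name `adaptive_window_mean` is hypothetical.

```python
from itertools import accumulate


def adaptive_window_mean(x, left, right):
    """Mean of x over a per-position window [i - left[i], i + right[i]].

    Each window sum is a difference of two prefix sums, so the whole pass
    costs O(n) regardless of how large the individual windows are.
    """
    n = len(x)
    prefix = [0.0] + list(accumulate(x))  # prefix[i] == sum(x[:i])
    out = []
    for i in range(n):
        lo = max(0, i - left[i])       # clamp the window to the sequence
        hi = min(n - 1, i + right[i])
        out.append((prefix[hi + 1] - prefix[lo]) / (hi - lo + 1))
    return out


if __name__ == "__main__":
    print(adaptive_window_mean([1, 2, 3, 4], left=[0, 1, 1, 3], right=[0, 1, 0, 0]))
```

Contrast this with a fixed kernel of size $k$, where each position costs $k$ multiply-adds; here the per-position cost stays constant even when a window spans the entire sequence.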

convolution, machine learning, natural language, (17 more...)

arXiv:2002.03184

Country: North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CCMatrix: A billion-scale bitext data set for training translation models

#artificialintelligence, Feb-7-2020, 18:41:06 GMT

CCMatrix is the largest data set of high-quality, web-based bitexts for training translation models. With more than 4.5 billion parallel sentences in 576 language pairs pulled from snapshots of the CommonCrawl public data set, CCMatrix is more than 50 times larger than the WikiMatrix corpus that we shared last year. Gathering a data set of this size required modifying our previous bitext mining approach used for WikiMatrix, assuming that the translation of one sentence could be found anywhere on CommonCrawl, which functions as an open archive of the internet. To address the significant computational challenges posed by comparing billions of sentences to determine which ones are mutual translations, we used massively parallel processing, as well as our highly efficient FAISS library for fast similarity searches. We're sharing details about how we created CCMatrix, and the tools needed for other researchers to reproduce our results and use this corpus for their work.
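To make the mining step concrete, here is a minimal sketch of finding mutual translations by comparing sentence embeddings. It is a simplification under stated assumptions and not the CCMatrix pipeline: the vectors are toy values rather than multilingual sentence embeddings, the search is brute force instead of a FAISS index, and the mutual-nearest-neighbour rule with a fixed `threshold` is a stand-in criterion chosen for illustration. The name `mine_pairs` is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def mine_pairs(src_vecs, tgt_vecs, threshold=0.8):
    """Return (src_idx, tgt_idx, score) for mutual nearest neighbours.

    A pair is kept only if each sentence is the other's best match and the
    similarity reaches the (assumed) threshold.
    """
    sims = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    # Best target for every source sentence, and best source for every target.
    best_tgt = [max(range(len(tgt_vecs)), key=row.__getitem__) for row in sims]
    best_src = [
        max(range(len(src_vecs)), key=lambda i: sims[i][j])
        for j in range(len(tgt_vecs))
    ]
    return [
        (i, j, sims[i][j])
        for i, j in enumerate(best_tgt)
        if best_src[j] == i and sims[i][j] >= threshold
    ]


if __name__ == "__main__":
    src = [[1.0, 0.0], [0.0, 1.0]]
    tgt = [[0.1, 0.9], [0.9, 0.1], [1.0, 1.0]]
    for i, j, score in mine_pairs(src, tgt):
        print(i, j, round(score, 3))
```

The all-pairs similarity matrix built here is quadratic in the number of sentences, which is exactly the cost that becomes prohibitive at billions of sentences and motivates an indexed similarity search.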

billion-scale bitext data, ccmatrix, training translation model, (7 more...)

Genre: Research Report (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Communications > Social Media (0.87)

Translating Web Search Queries into Natural Language Questions

Kumar, Adarsh, Dandapat, Sandipan, Chordia, Sushil

arXiv.org Artificial Intelligence, Feb-7-2020

Users often query a search engine with a specific question in mind, and these queries are often keywords or sub-sentential fragments. For example, if users want to know the answer to "What's the capital of USA", they will most probably query "capital of USA" or "USA capital" or some keyword-based variation of this. Conversely, for the user-entered query "capital of USA", the most probable question intent is "What's the capital of USA?". In this paper, we propose a method to generate a well-formed natural language question from a given keyword-based query, such that the question has the same intent as the query. Converting a keyword-based web query into a well-formed question has many applications, including search engines, Community Question Answering (CQA) websites, and bot communication. We found a synergy between the query-to-question problem and the standard machine translation (MT) task. We used both Statistical MT (SMT) and Neural MT (NMT) models to generate questions from queries. We observed that MT models perform well in terms of both automatic and human evaluation.

query, question intent, search engine, (13 more...)

arXiv:2002.02631

Country:

North America > United States > Kansas (0.05)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.05)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

Welleck, Sean, Kulikov, Ilia, Kim, Jaedeok, Pang, Richard Yuanzhe, Cho, Kyunghyun

arXiv.org Machine Learning, Feb-6-2020

Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition. We study the related issue of receiving infinite-length sequences from a recurrent language model when using common decoding algorithms. To analyze this issue, we first define inconsistency of a decoding algorithm, meaning that the algorithm can yield an infinite-length sequence that has zero probability under the model. We prove that commonly used incomplete decoding algorithms - greedy search, beam search, top-k sampling, and nucleus sampling - are inconsistent, despite the fact that recurrent language models are trained to produce sequences of finite length. Based on these insights, we propose two remedies which address inconsistency: consistent variants of top-k and nucleus sampling, and a self-terminating recurrent language model. Empirical results show that inconsistency occurs in practice, and that the proposed methods prevent inconsistency.
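The abstract does not spell out how the consistent sampling variants work. The sketch below assumes the simplest reading: the end-of-sequence token is always kept among the top-k candidates, so terminating never has zero probability at any step as long as the model assigns it nonzero mass. The function names and the toy distribution are hypothetical, and this should not be taken as the authors' exact formulation.

```python
import random


def candidate_set(probs, k, eos_id):
    """Top-k token ids by probability, with the end-of-sequence id always included."""
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    if eos_id not in top:
        top.append(eos_id)  # plain top-k would drop it, making termination impossible
    return top


def sample_next(probs, k, eos_id, rng=random):
    """Sample one token id from the renormalised candidate set."""
    ids = candidate_set(probs, k, eos_id)
    weights = [probs[i] for i in ids]  # random.choices normalises the weights
    return rng.choices(ids, weights=weights, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # token 3 plays the role of end-of-sequence
    print(candidate_set(probs, k=2, eos_id=3))
    print(sample_next(probs, k=2, eos_id=3, rng=random.Random(0)))
```

With plain top-2 sampling on this distribution the end-of-sequence token could never be drawn, which is the kind of situation that allows a decoder to continue indefinitely.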

algorithm, language model, sequence, (13 more...)

arXiv:2002.02492

Country:

North America > United States > Texas (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Translate this: How real-time translation breaks down barriers when you don't speak the language

USATODAY - Tech Top Stories, Feb-5-2020, 12:50:55 GMT

In the sci-fi world crafted by Douglas Adams in "The Hitchhiker's Guide to the Galaxy," you'd just slap a bright yellow Babel fish in your ear and simply be able to understand any mix of languages around you. While we aren't quite there yet, language is becoming less of a barrier than in generations past. "Understanding is going to become the new normal," says Dave Limp, Amazon's senior vice president of devices and services. Kids "will never grow up in a world where they aren't able to hear any language." To that end, today's technology is helping to interpret and translate the world around us in ways that are nearing seamless and in real time. From apps on your phone to increasingly multilingual virtual personal assistants, communicating as a tourist or with clients, friends and family who don't speak the same language is less of a challenge. Yet for all the authentic gains achieved in translation over the past several years, don't count on your phone, smart speaker, PC or ear device ...

artificial intelligence, natural language, social media, (15 more...)

Country:

Asia > China (0.48)
North America > United States > California > San Francisco County > San Francisco (0.15)

Industry:

Health & Medicine (0.74)
Information Technology (0.70)
Government (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Smart Language Translation Solutions and Software for Enterprise - Lingmo International

#artificialintelligence, Feb-4-2020, 04:22:23 GMT

We understand that when the language barrier is removed, it is easier to communicate with consumers who speak other languages. We can help you speak to your customers in 80 languages and scale into new international markets with our smart translation solutions.

language translation solution and software, smart language translation solution

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)
