AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


Artificial Intelligence-OCR and text-translation with python Udemy Coupon

#artificialintelligence | Jun-9-2020, 20:53:53 GMT

The translated result is sent to the result queue. The Vision API can detect and extract text from images. You can actually do a lot of things with the help of the Google Translate API ranging from detecting languages to simple text translation, setting source and destination languages, and translating entire lists of text phrases. In this article, you will see how to work with the Google Translate API in the Python programming language. What is Artificial Intelligence: According to the Merriam-Webster dictionary, Artificial Intelligence is "a branch of computer science dealing with the simulation of intelligent behavior in computers" with "the capability of a machine to imitate intelligent human behavior".

artificial intelligence-ocr and text-translation, machine learning, programming language, (8 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.39)

Industry:

Information Technology > Services (0.43)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Software > Programming Languages (0.76)
Information Technology > Artificial Intelligence > Machine Learning (0.76)


Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers

Xiao, Tim Z., Gomez, Aidan N., Gal, Yarin

arXiv.org Machine Learning | Jun-8-2020

We detect out-of-training-distribution sentences in Neural Machine Translation using the Bayesian Deep Learning equivalent of Transformer models. For this we develop a new measure of uncertainty designed specifically for long sequences of discrete random variables -- i.e. words in the output sentence. Our new measure of uncertainty solves a major intractability in the naive application of existing approaches on long sentences. We use our new measure on a Transformer model trained with dropout approximate inference. On the task of German-English translation using WMT13 and Europarl, we show that with dropout uncertainty our measure is able to identify when Dutch source sentences, sentences which use the same word types as German, are given to the model instead of German.
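The idea of dropout-based uncertainty mentioned in this abstract can be illustrated with a small sketch. This is a generic Monte Carlo dropout illustration and not the authors' measure: it uses the naive approach of running several stochastic passes and taking the entropy over whole sampled outputs, which is the kind of estimate that becomes impractical for long sentences. The toy "model" (`toy_translate`), the lexicon, and all parameter values are hypothetical.

```python
import random
from collections import Counter
from math import log

def toy_translate(tokens, lexicon, drop_p, rng):
    """Hypothetical stand-in for one stochastic forward pass with dropout.

    Known source words map to a fixed output word unless dropout noise
    perturbs them; unknown words always get a random guess.
    """
    vocab = sorted(set(lexicon.values()))
    out = []
    for tok in tokens:
        if tok in lexicon and rng.random() >= drop_p:
            out.append(lexicon[tok])
        else:
            out.append(rng.choice(vocab))
    return tuple(out)

def sample_entropy(samples):
    """Entropy (in nats) of the empirical distribution over whole outputs."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * log(c / total) for c in counts.values())

def mc_dropout_uncertainty(tokens, lexicon, passes=50, drop_p=0.1, seed=0):
    """Run several stochastic passes and measure how much the outputs disagree."""
    rng = random.Random(seed)
    samples = [toy_translate(tokens, lexicon, drop_p, rng) for _ in range(passes)]
    return sample_entropy(samples)

# Toy German lexicon (assumption); the Dutch sentence is mostly out of vocabulary.
lexicon = {"das": "the", "haus": "house", "ist": "is", "klein": "small"}
german = ["das", "haus", "ist", "klein"]
dutch = ["het", "huis", "is", "klein"]
print(mc_dropout_uncertainty(german, lexicon))
print(mc_dropout_uncertainty(dutch, lexicon))
```

In this toy setting the Dutch input yields far more disagreement between passes than the German one, which is the signal an out-of-distribution detector would threshold on.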

training data, translation, uncertainty estimate, (15 more...)

arXiv.org Machine Learning

2006.08344

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Nevada (0.05)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation

Nagoudi, El Moatez Billah, Abdul-Mageed, Muhammad, Cavusoglu, Hasan

arXiv.org Machine Learning | Jun-7-2020

We describe our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE) (Mayhew et al., 2020). We view MT models at various training stages (i.e., checkpoints) as human learners at different levels. Hence, we employ an ensemble of multiple checkpoints from the same model to generate translation sequences with various levels of fluency. From each checkpoint, for our best model, we sample n-Best sequences (n = 10) with a beam width of 100. We achieve 37.57 macro F1 with a 6-checkpoint model ensemble on the official English to Portuguese shared task test data, outperforming a baseline Amazon translation system of 21.30 macro F1 and ultimately demonstrating the utility of our intuitive method.
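A minimal sketch of pooling n-best lists across checkpoints follows. The abstract does not say how the per-checkpoint hypotheses are combined, so the de-duplicated union used here is an assumption, as are the function name `pool_nbest`, the checkpoint labels, and the example scores.

```python
def pool_nbest(nbest_by_checkpoint, n=10):
    """Union the top-n hypotheses from every checkpoint, keeping first-seen order.

    nbest_by_checkpoint maps a checkpoint label to a list of (text, score)
    pairs, where a higher score means a better hypothesis (assumption).
    """
    pooled = []
    seen = set()
    for _checkpoint, hyps in nbest_by_checkpoint.items():
        top = sorted(hyps, key=lambda h: h[1], reverse=True)[:n]
        for text, _score in top:
            if text not in seen:
                seen.add(text)
                pooled.append(text)
    return pooled

# Hypothetical outputs from an early and a late checkpoint of the same model.
nbest = {
    "ckpt_early": [("eu sou um menino", -0.2), ("eu sou menino", -1.1)],
    "ckpt_late": [
        ("eu sou um menino", -0.1),
        ("sou um garoto", -0.9),
        ("eu sou um garoto", -0.5),
    ],
}
print(pool_nbest(nbest, n=2))
```

Pooling across checkpoints keeps less fluent variants from earlier training stages alongside the polished output of later ones, which matches the stated goal of covering several levels of fluency.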

artificial intelligence, machine translation, natural language, (14 more...)

arXiv.org Machine Learning

2006.0405

Country:

North America > Canada > British Columbia (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


How Google is using emerging AI techniques to improve language translation quality

#artificialintelligence | Jun-4-2020, 03:22:03 GMT

Google says it's made progress toward improving translation quality for languages that don't have a copious amount of written text. In a forthcoming blog post, the company details new innovations that have enhanced the user experience in the 108 languages (particularly in data-poor languages Yoruba and Malayalam) supported by Google Translate, its service that translates an average of 150 billion words daily. In the 13 years since the public debut of Google Translate, techniques like neural machine translation, rewriting-based paradigms, and on-device processing have led to quantifiable leaps in the platform's translation accuracy. But until recently, even the state-of-the-art algorithms underpinning Translate lagged behind human performance. Efforts beyond Google illustrate the magnitude of the problem -- the Masakhane project, which aims to render thousands of languages on the African continent automatically translatable, has yet to move beyond the data-gathering and transcription phase.

artificial intelligence, natural language, translation, (18 more...)

#artificialintelligence

Country: Asia > China > Guangdong Province > Shenzhen (0.06)

Industry: Information Technology > Services (0.39)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


Seq2Seq AI Chatbot with Attention Mechanism

Sojasingarayar, Abonia

arXiv.org Artificial Intelligence | Jun-4-2020

Developing intelligent conversational agents with Artificial Intelligence or Machine Learning techniques is an interesting problem in the field of Natural Language Processing. Many research projects use Artificial Intelligence, Machine Learning algorithms, and Natural Language Processing techniques to develop conversation/dialogue agents. In the past, methods for constructing chatbot architectures have relied on handwritten rules and templates or simple statistical methods. With the rise of deep learning, these models were quickly replaced by end-to-end trainable neural networks around 2015. More specifically, the recurrent encoder-decoder model [Cho et al., 2014] dominates the task of conversational modeling. This architecture was adapted from the neural machine translation domain, where it performs extremely well.

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2006.02767

Country:

North America > United States > California (0.14)
North America > Canada (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives

Tavarageri, Sanket, Heinecke, Alexander, Avancha, Sasikanth, Goyal, Gagandeep, Upadrasta, Ramakrishna, Kaul, Bharat

arXiv.org Artificial Intelligence | Jun-2-2020

Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becoming ubiquitous, including in software for image recognition, speech recognition, speech synthesis, and language translation, to name a few. The training of DNN architectures, however, is computationally expensive. Once the model is created, its use in the intended application (the inference task) is computationally heavy too, and the inference needs to be fast for real-time use. For obtaining high performance today, the norm is code for Deep Learning (DL) primitives that expert programmers optimize for specific architectures and expose via libraries. However, given the constant emergence of new DNN architectures, creating hand-optimized code is expensive, slow, and not scalable. To address this performance-productivity challenge, in this paper we present compiler algorithms to automatically generate high-performance implementations of DL primitives that closely match the performance of hand-optimized libraries. We develop novel data reuse analysis algorithms using the polyhedral model to derive efficient execution schedules automatically. In addition, because most DL primitives use some variant of matrix multiplication at their core, we develop a flexible framework where it is possible to plug in library implementations of the same in lieu of a subset of the loops. We show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance. We also develop compiler algorithms to perform operator fusions that reduce data movement through the memory hierarchy of the computer system.

machine learning, natural language, variant, (20 more...)

arXiv.org Artificial Intelligence

2006.0223

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Cascaded Text Generation with Markov Transformers

Deng, Yuntian, Rush, Alexander M.

arXiv.org Machine Learning | Jun-1-2020

The two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies. This work proposes an autoregressive model with sub-linear parallel time generation. Noting that conditional random fields with bounded context can be decoded in parallel, we propose an efficient cascaded decoding approach for generating high-quality output. To parameterize this cascade, we introduce a Markov transformer, a variant of the popular fully autoregressive model that allows us to simultaneously decode with specific autoregressive context cutoffs. This approach requires only a small modification from standard autoregressive training, while showing competitive accuracy/speed tradeoff compared to existing methods on five machine translation datasets.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2006.01112

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)


Covid-19 Is History's Biggest Translation Challenge

WIRED | May-31-2020, 15:48:30 GMT

You, a person who's currently on the English-speaking internet in The Year of The Pandemic, have definitely seen public service information about Covid-19. You've probably been unable to escape seeing quite a lot of it, both online and offline, from handwashing posters to social distancing tape to instructional videos for face covering. But if we want to avoid a pandemic spreading to all the humans in the world, this information also has to reach all the humans of the world--and that means translating Covid PSAs into as many languages as possible, in ways that are accurate and culturally appropriate. It's easy to overlook how important language is for health if you're on the English-speaking internet, where "is this headache actually something to worry about?" is only a quick Wikipedia article or WebMD search away. For over half of the world's population, people can't expect to Google their symptoms, nor even necessarily get a pamphlet from their doctor explaining their diagnosis, because it's not available in a language they can understand.

artificial intelligence, information, natural language, (17 more...)

WIRED

Country:

Oceania > Australia > Northern Territory (0.05)
North America > Guatemala (0.05)
Asia > India (0.05)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.85)
Information Technology > Communications > Social Media (0.71)


SLAM-Inspired Simultaneous Contextualization and Interpreting for Incremental Conversation Sentences

Takimoto, Yusuke, Fukuchi, Yosuke, Matsumori, Shoya, Imai, Michita

arXiv.org Artificial Intelligence | May-29-2020

Distributed representation of words has improved the performance for many natural language tasks. In many methods, however, only one meaning is considered for one label of a word, and multiple meanings of polysemous words depending on the context are rarely handled. Although research works have dealt with polysemous words, they determine the meanings of such words according to a batch of large documents. Hence, there are two problems with applying these methods to sequential sentences, as in a conversation that contains ambiguous expressions. The first problem is that the methods cannot sequentially deal with the interdependence between context and word interpretation, in which context is decided by word interpretations and the word interpretations are decided by the context. Context estimation must thus be performed in parallel to pursue multiple interpretations. The second problem is that the previous methods use large-scale sets of sentences for offline learning of new interpretations, and the steps of learning and inference are clearly separated. Such methods using offline learning cannot obtain new interpretations during a conversation. Hence, to dynamically estimate the conversation context and interpretations of polysemous words in sequential sentences, we propose a method of Simultaneous Contextualization And INterpreting (SCAIN) based on the traditional Simultaneous Localization And Mapping (SLAM) algorithm. By using the SCAIN algorithm, we can sequentially optimize the interdependence between context and word interpretation while obtaining new interpretations online. For experimental evaluation, we created two datasets: one from Wikipedia's disambiguation pages and the other from real conversations. For both datasets, the results confirmed that SCAIN could effectively achieve sequential optimization of the interdependence and acquisition of new interpretations.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2005.14662

Country:

Europe > Belgium > Flanders (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)


SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions

Ye, Mao, Gong, Chengyue, Liu, Qiang

arXiv.org Machine Learning | May-29-2020

State-of-the-art NLP models can often be fooled by human-unaware transformations such as synonymous word substitution. For security reasons, it is of critical importance to develop models with certified robustness that can provably guarantee that the prediction cannot be altered by any possible synonymous word substitution. In this work, we propose a certified robust method based on a new randomized smoothing technique, which constructs a stochastic ensemble by applying random word substitutions on the input sentences, and leverages the statistical properties of the ensemble to provably certify the robustness. Our method is simple and structure-free in that it only requires black-box queries of the model outputs, and hence can be applied to any pre-trained models (such as BERT) and any types of models (word-level or subword-level). Our method significantly outperforms recent state-of-the-art methods for certified robustness on both IMDB and Amazon text classification tasks. To the best of our knowledge, we are the first work to achieve certified robustness on large systems such as BERT with practically meaningful certified accuracy.
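The stochastic ensemble described in this abstract can be sketched as a majority vote over randomly substituted copies of the input. This is only the smoothing step under simple assumptions; the statistical certification that the paper builds on top of it is omitted. The function name `smoothed_predict`, the synonym table, and the toy classifier are all hypothetical.

```python
import random
from collections import Counter

def smoothed_predict(classify, sentence, synonyms, num_samples=100, seed=0):
    """Majority vote of a black-box classifier over random synonym substitutions.

    classify: any callable mapping a list of tokens to a label (black box).
    synonyms: maps a word to its candidate replacements, which are assumed
              to include the word itself; words without an entry stay as-is.
    Returns the winning label and its share of the votes.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(num_samples):
        perturbed = [rng.choice(synonyms.get(w, [w])) for w in sentence]
        votes[classify(perturbed)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / num_samples

# Toy black-box sentiment classifier (assumption): keyword lookup.
def toy_classifier(tokens):
    return "positive" if {"good", "great", "fine"} & set(tokens) else "negative"

synonym_table = {"good": ["good", "great", "fine"]}
print(smoothed_predict(toy_classifier, ["a", "good", "film"], synonym_table))
```

Because the sketch only calls `classify` on token lists, it works with any model that exposes predictions, which is the structure-free property the abstract emphasizes.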

accuracy, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2005.14424

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.49)
