AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Unsupervised State Representation Learning in Atari

Anand, Ankesh, Racah, Evan, Ozair, Sherjil, Bengio, Yoshua, Côté, Marc-Alexandre, Hjelm, R Devon

arXiv.org Machine Learning | Jun-19-2019

State representation learning, or the ability to capture latent generative factors of an environment, is crucial for building intelligent agents that can perform a wide variety of tasks. Learning such representations without supervision from rewards is a challenging open problem. We introduce a method that learns state representations by maximizing mutual information across spatially and temporally distinct features of a neural encoder of the observations. We also introduce a new benchmark based on Atari 2600 games where we evaluate representations based on how well they capture the ground truth state variables. We believe this new framework for evaluating representation learning models will be crucial for future representation learning research. Finally, we compare our technique with other state-of-the-art generative and contrastive representation learning methods.
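The abstract does not say which mutual-information estimator is used. As a minimal sketch, assuming an InfoNCE-style contrastive bound (an assumption, not something the text above confirms), the idea of scoring matched feature pairs against mismatched ones might look like this in plain Python. The names `dot` and `info_nce_loss` and the dot-product score are illustrative choices.

```python
import math


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(anchors, positives):
    """Mean InfoNCE loss over a batch of feature pairs.

    For anchor i, positives[i] is the matching feature and every other
    positives[j] serves as a negative. Scores are plain dot products
    (an illustrative assumption).
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        scores = [dot(anchors[i], positives[j]) for j in range(n)]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(scores)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_denominator - scores[i]
    return total / n


# Matched pairs score highly against their own partner, so the loss is small.
aligned = info_nce_loss([[5.0, 0.0], [0.0, 5.0]], [[1.0, 0.0], [0.0, 1.0]])
# Swapping the partners makes each anchor prefer a negative, so the loss is large.
mismatched = info_nce_loss([[5.0, 0.0], [0.0, 5.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, mismatched)
```

With N pairs, log(N) minus this loss is a lower bound on the mutual information between the two sets of features, so minimizing the loss tightens the bound.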

information, international conference, representation, (14 more...)

1906.08226

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

The Challenge of Open Source Machine Translation

#artificialintelligence | Jun-18-2019, 16:41:58 GMT

We live in a time when open-source machine learning and AI-related development platforms are proliferating. As a result, many people believe that, given a large amount of data and a few computers, a functional and useful MT system can be built with a do-it-yourself (DIY) toolkit. However, as many who have tried have found out, the reality is much more complicated, and the path to success is long, winding and sometimes even treacherous. The vast majority of open-source MT efforts fail, either because they do not consistently produce output that is equal to, or better than, that of any easily accessed public MT solution or because they cannot be deployed effectively. This is not to say that success is impossible, but the investment and long-term commitment it requires are often underestimated or simply not properly understood. A case can always be made for private systems that offer greater control and security, even if they are generally less accurate than public MT options.

artificial intelligence, machine learning, natural language, (5 more...)

Technology:

Information Technology > Software (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.43)
Information Technology > Artificial Intelligence > Machine Learning (0.42)
Information Technology > Communications > Social Media (0.40)

Misleading Failures of Partial-input Baselines

Feng, Shi, Wallace, Eric, Boyd-Graber, Jordan

arXiv.org Artificial Intelligence | Jun-18-2019

Recent work establishes dataset difficulty and removes annotation artifacts via partial-input baselines (e.g., hypothesis-only models for SNLI or question-only models for VQA). When a partial-input baseline gets high accuracy, a dataset is cheatable. However, the converse is not necessarily true: the failure of a partial-input baseline does not mean a dataset is free of artifacts. To illustrate this, we first design artificial datasets which contain trivial patterns in the full input that are undetectable by any partial-input model. Next, we identify such artifacts in the SNLI dataset: a hypothesis-only model augmented with trivial patterns in the premise can solve 15% of the examples that were previously considered "hard". Our work provides a caveat for the use of partial-input baselines for dataset verification and creation.
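The claim that a failing partial-input baseline does not rule out artifacts can be illustrated with a toy construction. The sketch below is not the authors' dataset; it assumes a hypothetical two-bit example where the label is the XOR of one premise bit and one hypothesis bit, so the pattern is trivial given the full input but invisible to a model that sees only the hypothesis. The helper `best_lookup_accuracy` is an illustrative name.

```python
from collections import Counter, defaultdict
from itertools import product

# Toy dataset (hypothetical): each example is (premise_bit, hypothesis_bit, label),
# where label = premise_bit XOR hypothesis_bit.
data = [(p, h, p ^ h) for p, h in product([0, 1], repeat=2)]


def best_lookup_accuracy(examples, key):
    """Accuracy of the best classifier that only sees key(example).

    The best such classifier predicts the majority label for each
    distinct value of the key.
    """
    by_key = defaultdict(Counter)
    for ex in examples:
        by_key[key(ex)][ex[2]] += 1
    correct = sum(max(counts.values()) for counts in by_key.values())
    return correct / len(examples)


# A hypothesis-only model cannot beat chance on this data.
hypothesis_only = best_lookup_accuracy(data, key=lambda ex: ex[1])
# A model that sees both inputs can solve it with a trivial rule.
full_input = best_lookup_accuracy(data, key=lambda ex: (ex[0], ex[1]))

print(hypothesis_only)  # 0.5
print(full_input)       # 1.0
```

Here the hypothesis-only baseline scores at chance even though the dataset is solvable by a trivial rule over the full input, which is the kind of gap the abstract warns about.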

artificial intelligence, machine learning, natural language, (18 more...)

1905.05778

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

Bridging the Gap between Training and Inference for Neural Machine Translation

Zhang, Wen, Feng, Yang, Meng, Fandong, You, Di, Liu, Qun

arXiv.org Machine Learning | Jun-17-2019

Neural Machine Translation (NMT) generates target words sequentially, predicting each next word conditioned on the context words. At training time, it predicts with the ground truth words as context, while at inference it has to generate the entire sequence from scratch. This discrepancy in the fed context leads to error accumulation along the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence, which leads to overcorrection of different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the sequence predicted by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experimental results on Chinese->English and WMT'14 English->German translation tasks demonstrate that our approach can achieve significant improvements on multiple datasets.
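The core idea of mixing ground-truth and model-predicted words as training context can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `mix_context` and the sampling probability `p_truth` are hypothetical, and the predicted sequence is simply passed in rather than selected with a sentence-level optimum as the abstract describes.

```python
import random


def mix_context(ground_truth, predicted, p_truth, rng):
    """Build a training context by sampling per position.

    Each position takes the ground-truth word with probability p_truth
    and the model-predicted word otherwise. p_truth is a hypothetical
    parameter; the abstract does not say how it is set.
    """
    if len(ground_truth) != len(predicted):
        raise ValueError("sequences must have equal length")
    return [
        g if rng.random() < p_truth else p
        for g, p in zip(ground_truth, predicted)
    ]


rng = random.Random(0)  # seeded for repeatability
ground_truth = ["the", "cat", "sat"]
predicted = ["a", "cat", "sits"]

# p_truth = 1.0 reproduces training on ground-truth context only;
# p_truth = 0.0 feeds only the model's own predictions, as at inference.
print(mix_context(ground_truth, predicted, 1.0, rng))
print(mix_context(ground_truth, predicted, 0.0, rng))
print(mix_context(ground_truth, predicted, 0.5, rng))
```

Values of `p_truth` between 0 and 1 expose the model to its own predictions during training, which is how this kind of sampling narrows the gap between training and inference.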

artificial intelligence, natural language, translation, (16 more...)

1906.02448

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Worcester County > Worcester (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Benchmarking Neural Machine Translation for Southern African Languages

Martinus, Laura, Abbott, Jade Z.

arXiv.org Machine Learning | Jun-17-2019

Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to discover and obtain. As a result, the data and code for existing research have rarely been shared. This has led to a struggle to reproduce reported results, and few publicly available benchmarks for African machine translation models exist. To start to address these problems, we trained neural machine translation models for 5 Southern African languages on publicly available datasets. Code is provided for training the models and evaluating them on a newly released evaluation set, with the aim of spurring future research in the field for Southern African languages.

artificial intelligence, machine translation, natural language, (14 more...)

1906.10511

Country: Africa (0.22)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Multilingual translation tools spread in Japan with new visa system

The Japan Times | Jun-16-2019, 10:31:55 GMT

The use of multilingual translation tools is expanding in Japan, where foreign workers are expected to increase in the wake of April's launch of new visa categories. A growing number of local governments, labor unions and other entities have decided to introduce translation tools, which help foreigners through administrative procedures by allowing local officials and other officers to talk to applicants in their native languages. "Talking in the applicants' own languages makes it easier to convey our cooperative stance," said an official in Tokyo's Sumida Ward. The ward introduced VoiceBiz, an audio translation app developed by Toppan Printing Co. that covers 30 languages. The app, which can be downloaded onto smartphones and tablet computers, will be used in eight municipalities, including Osaka and Ayase in Kanagawa Prefecture, company officials said.

artificial intelligence, machine translation, natural language, (9 more...)

AI-Alerts: 2019 > 2019-06 > AAAI AI-Alert for Jun 18, 2019 (1.00)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.60)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.27)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.27)
Asia > Japan > Shikoku > Tokushima Prefecture > Tokushima (0.10)

Industry: Government (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)

Agency plus automation: Designing artificial intelligence into interactive systems

#artificialintelligence | Jun-14-2019, 08:24:29 GMT

Much contemporary rhetoric regards the prospects and pitfalls of using artificial intelligence techniques to automate an increasing range of tasks, especially those once considered the purview of people alone. These accounts are often wildly optimistic, understating outstanding challenges while turning a blind eye to the human labor that undergirds and sustains ostensibly "automated" services. This long-standing focus on purely automated methods unnecessarily cedes a promising design space: one in which computational assistance augments and enriches, rather than replaces, people's intellectual work. This tension between human agency and machine automation poses vital challenges for design and engineering. In this work, we consider the design of systems that enable rich, adaptive interaction between people and algorithms. We seek to balance the often-complementary strengths and weaknesses of each, while promoting human control and skillful action. We share case studies of interactive systems we have developed in three arenas--data wrangling, exploratory analysis, and natural language translation--that integrate proactive computational support into interactive systems. To improve outcomes and support learning by both people and machines, we describe the use of shared representations of tasks augmented with predictive models of human capabilities and actions. We conclude with a discussion of future prospects and scientific frontiers for intelligence augmentation research. Although sharing overlapping origins in midcentury computer science, research programs in intelligence augmentation (IA; using computers to extend people's ability to process information and reason about complex problems) and artificial intelligence (AI; developing computational methods for perception, reasoning, and action) have to date charted largely separate trajectories.

data mining, machine learning, natural language, (19 more...)

Country: North America > United States (0.28)

Industry:

Health & Medicine (0.49)
Transportation > Air (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

A Focus on Neural Machine Translation for African Languages

Martinus, Laura, Abbott, Jade Z.

arXiv.org Machine Learning | Jun-14-2019

African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages, so there is scant research regarding the problems that arise when using machine translation techniques. To begin addressing these problems, we trained models to translate English to five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results obtained show the promise of using neural machine translation techniques for African languages. By providing reproducible, publicly available data, code and results, this research aims to provide a starting point for other researchers in African machine translation to compare to and build upon.

artificial intelligence, natural language, translation, (17 more...)

1906.05685

Country:

Africa > South Africa > Gauteng > Johannesburg (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Southern Africa (0.04)
Africa > South Africa > Western Cape > Cape Town (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Resolving Gendered Ambiguous Pronouns with BERT

Ionita, Matei, Kashnitsky, Yury, Krige, Ken, Larin, Vladimir, Logvinenko, Denis, Atanasov, Atanas

arXiv.org Machine Learning | Jun-13-2019

Pronoun resolution is part of coreference resolution, the task of pairing an expression to its referring entity. This is an important task for natural language understanding and a necessary component of machine translation systems, chat bots and assistants. Neural machine learning systems perform far from ideally on this task, with F1 scores as low as 73% on modern benchmark datasets. Moreover, they tend to perform better for masculine pronouns than for feminine ones. Thus, the problem is both challenging and important for NLP researchers and practitioners. In this project, we describe our BERT-based approach to solving the problem of gender-balanced pronoun resolution. We are able to reach a 92% F1 score and a much lower gender bias on the benchmark dataset shared by the Google AI Language team.

machine learning, natural language, resolution, (15 more...)

1906.01161

Country: