AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

Comparing BERT against traditional machine learning text classification

González-Carvajal, Santiago, Garrido-Merchán, Eduardo C.

arXiv.org Machine Learning, May-26-2020

In recent years, BERT has emerged as a popular state-of-the-art machine learning model that can handle multiple NLP tasks, such as supervised text classification, without human supervision. Its flexibility in coping with any type of corpus while delivering strong results has made the approach very popular not only in academia but also in industry. However, many other approaches have been used successfully over the years. In this work, we first present BERT and include a short review of classical NLP approaches. Then, through a suite of experiments covering different scenarios, we empirically compare the behaviour of BERT against a traditional TF-IDF vocabulary fed to machine learning algorithms. The purpose of this work is to add empirical evidence to support or refute the use of BERT as a default on NLP tasks. The experiments show the superiority of BERT and its independence from features of the NLP problem, such as the language of the text, adding empirical evidence for using BERT as a default technique in NLP problems.
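The TF-IDF baseline mentioned in this abstract can be illustrated with a minimal sketch in plain Python. The toy corpus, the whitespace tokenizer, and the smoothed IDF formula below are assumptions chosen for illustration; they are not taken from the paper's experimental setup.

```python
import math
from collections import Counter


def tfidf(corpus):
    """Return one {term: weight} dict per document.

    Assumed weighting (one common smoothed variant, not the paper's):
    tf = count / document length, idf = log((1 + N) / (1 + df)) + 1.
    """
    docs = [doc.lower().split() for doc in corpus]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of three short "reviews".
corpus = ["the film was great", "the film was dull", "great acting"]
vectors = tfidf(corpus)
```

In the second document, the term "dull" appears in only one document and therefore receives a higher weight than "the", which appears in two. Vectors of this kind are what a traditional classifier would take as input.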

machine learning, natural language, text classification, (19 more...)

arXiv: 2005.13012

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.06)
Europe > Spain > Galicia > Madrid (0.05)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.74)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Language barrier hampers distribution of virus info to Hiroshima's foreign residents

The Japan Times, May-22-2020, 07:01:52 GMT

The language barrier is preventing many foreign residents in Hiroshima Prefecture from keeping abreast of the latest status of the coronavirus pandemic, highlighting the need for municipalities to provide essential information in multiple languages. "It was through social media that I came to know about the whole kyūgyō yōsei thing," Michelle Crothers, an Australian who runs an English conversation school in the city of Hiroshima, said, referring to the Japanese phrase for "request to suspend businesses." On April 18, when the prefecture issued the request in line with the state of emergency declared by the central government, Crothers stumbled upon a friend's social media post written in English about the prefecture's announcement. She then fumbled her way through official websites by the government and the prefecture in hopes of finding out whether her school will have to shut down in line with the request, but ended up giving up. As an extra precaution, she decided to close it for the time being.

artificial intelligence, natural language, social media, (12 more...)

Country: Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.90)

Industry:

Government (0.72)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.55)
Health & Medicine > Therapeutic Area > Immunology (0.55)
Health & Medicine > Epidemiology (0.55)

Technology:

Information Technology > Communications > Social Media (0.75)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)

Dual Learning: Theoretical Study and an Algorithmic Extension

Zhao, Zhibing, Xia, Yingce, Qin, Tao, Xia, Lirong, Liu, Tie-Yan

arXiv.org Machine Learning, May-17-2020

Dual learning has been successfully applied in many machine learning applications, including machine translation and image-to-image transformation. The high-level idea of dual learning is very intuitive: if we map an $x$ from one domain to another and then map it back, we should recover the original $x$. Although its effectiveness has been empirically verified, theoretical understanding of dual learning is still very limited. In this paper, we aim at understanding why and when dual learning works. Based on our theoretical analysis, we further extend dual learning by introducing more related mappings and propose multi-step dual learning, in which we leverage feedback signals from additional domains to improve the quality of the mappings. We prove that multi-step dual learning can boost the performance of standard dual learning under mild conditions. Experiments on WMT 14 English$\leftrightarrow$German and MultiUN English$\leftrightarrow$French translations verify our theoretical findings on dual learning, and the results on the translations among English, French, and Spanish of MultiUN demonstrate the effectiveness of multi-step dual learning.
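The round-trip idea in this abstract (map $x$ to another domain, map it back, and expect to recover $x$) can be sketched with a toy example. Everything below is an assumption for illustration: the two "translation models" are the scalar maps f(x) = a*x and g(y) = b*y, and training minimizes only the reconstruction error. Real dual learning systems combine this signal with other objectives, and this is not the authors' algorithm.

```python
def reconstruction_loss(a, b, xs):
    """Mean squared round-trip error, with f(x) = a*x and g(y) = b*y."""
    return sum((b * (a * x) - x) ** 2 for x in xs) / len(xs)


def train(xs, a=0.5, b=0.5, lr=0.01, steps=500):
    """Gradient descent on the round-trip error for both mappings."""
    for _ in range(steps):
        # Partial derivatives of mean((a*b*x - x)^2) with respect to a and b.
        grad_a = sum(2 * (a * b * x - x) * b * x for x in xs) / len(xs)
        grad_b = sum(2 * (a * b * x - x) * a * x for x in xs) / len(xs)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical samples from the source domain.
xs = [1.0, 2.0, 3.0]
a, b = train(xs)
final_loss = reconstruction_loss(a, b, xs)
```

After training, the product a*b approaches 1, so mapping forward and back returns the original input. The sketch also shows why reconstruction alone underdetermines the mappings: any pair with a*b = 1 has zero loss.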

artificial intelligence, machine learning, natural language, (17 more...)

arXiv: 2005.08238

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Spain (0.04)
Europe > Germany > Berlin (0.04)
Asia (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Boosting Arabic Named Entity Recognition Transliteration with Deep Learning

Alkhatib, Manar (The British University in Dubai) | Shaalan, Khaled (The British University in Dubai)

AAAI Conferences, May-16-2020

Transliterating named entities from one language into another is complicated and is considered one of the challenging tasks in machine translation (MT). To build a well-performing transliteration system, we apply well-established techniques based on hybrid deep learning. The system is based on a convolutional neural network (CNN) followed by a Bi-LSTM and a CRF. The proposed hybrid mechanism is evaluated on the ANERCorp and Kalimat corpora. The results show that the neural machine translation approach can be employed to build efficient machine transliteration systems that achieve state-of-the-art results for the Arabic-English language pair.

artificial intelligence, machine learning, natural language, (4 more...)

The Thirty-Third International Flairs Conference

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Impact of a New Word Embedding Cost Function on Farsi-Spanish Low-Resource Neural Machine Translation

Ahmadnia, Benyamin (Tulane University) | Dorr, Bonnie J. (Institute for Human and Machine Cognition)

AAAI Conferences, May-16-2020

Neural Machine Translation (NMT) relies heavily on word embeddings, which are continuous representations of words in a vector space, obtained from large monolingual data and, independently, from bilingual data for NMT model training. Word embeddings have proven to be invaluable for performance improvements in natural language analysis tasks that otherwise suffer from data scarcity. This paper defines a new cost function (demonstrated on Farsi-Spanish low-resource attention-based NMT) that encodes word similarity as distances within a word embedding space. The novelty of this cost function is that it encourages our attentional NMT model to generate words that are close to their references in the embedding space. This approach encourages the decoder to select acceptably similar words when potential candidates are found to be Out-Of-Vocabulary (OOV). Experimental results demonstrate improvements of our attentional NMT model over a community-standard NMT baseline model.
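The abstract does not give the cost function itself, so the following is only a hedged sketch of the general idea of scoring an output by its distance to the reference word in embedding space. The use of cosine distance, the expectation over the output distribution, the function names, and the two-dimensional toy embeddings are all assumptions for illustration.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def embedding_cost(probs, embeddings, reference):
    """Expected embedding distance between predicted words and the reference.

    probs maps each candidate word to the probability the model assigns it.
    """
    ref_vec = embeddings[reference]
    return sum(
        p * cosine_distance(embeddings[word], ref_vec)
        for word, p in probs.items()
    )


# Hypothetical 2-d embeddings; real embeddings have hundreds of dimensions.
embeddings = {
    "house": [1.0, 0.0],
    "home": [0.9, 0.1],
    "river": [0.0, 1.0],
}
exact = embedding_cost({"house": 1.0}, embeddings, "house")
near_miss = embedding_cost({"home": 1.0}, embeddings, "house")
far_miss = embedding_cost({"river": 1.0}, embeddings, "house")
```

Under a cost of this shape, predicting "home" for the reference "house" is penalized far less than predicting "river", which matches the stated goal of encouraging acceptably similar words.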

artificial intelligence, farsi-spanish low-resource neural machine translation, natural language, (1 more...)

The Thirty-Third International Flairs Conference

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations

Tan, Samson, Joty, Shafiq, Kan, Min-Yen, Socher, Richard

arXiv.org Artificial Intelligence, May-9-2020

Training on only perfect Standard English corpora predisposes pre-trained neural networks to discriminate against minorities from non-standard linguistic backgrounds (e.g., African American Vernacular English, Colloquial Singapore English, etc.). We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples that expose these biases in popular NLP models, e.g., BERT and Transformer, and show that adversarially fine-tuning them for a single epoch significantly improves robustness without sacrificing performance on clean data.

artificial intelligence, machine learning, natural language, (21 more...)

doi: 10.18653/v1/2020.acl-main.263

arXiv: 2005.04364

Country:

Asia > Singapore (0.25)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
Europe > France (0.14)
(20 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Communications > Social Media (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

Xu, Hongfei, van Genabith, Josef, Xiong, Deyi, Liu, Qiuhui

arXiv.org Artificial Intelligence, May-5-2020

The choice of hyper-parameters affects the performance of neural models. While much previous research (Sutskever et al., 2013; Duchi et al., 2011; Kingma and Ba, 2015) focuses on accelerating convergence and reducing the effects of the learning rate, comparatively few papers concentrate on the effect of batch size. In this paper, we analyze how increasing batch size affects gradient direction, and propose to evaluate the stability of gradients with their angle change. Based on our observations, the angle change of gradient direction first tends to stabilize (i.e. gradually decrease) while accumulating mini-batches, and then starts to fluctuate. We propose to automatically and dynamically determine batch sizes by accumulating gradients of mini-batches and performing an optimization step at just the time when the direction of gradients starts to fluctuate. To improve the efficiency of our approach for large models, we propose a sampling approach to select gradients of parameters sensitive to the batch size. Our approach dynamically determines proper and efficient batch sizes during training. In our experiments on the WMT 14 English to German and English to French tasks, our approach improves the Transformer with a fixed 25k batch size by +0.73 and +0.82 BLEU respectively.
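The accumulation rule described in this abstract can be sketched in a few lines of plain Python. The stopping criterion used here (stop as soon as one angle change exceeds the previous one, and leave out the mini-batch that caused it) is an assumed simplification, and the function names and toy gradients are hypothetical; the paper's exact criterion and its parameter-sampling step are not reproduced.

```python
import math


def angle(u, v):
    """Angle in radians between two non-zero gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def accumulate_until_fluctuation(minibatch_grads):
    """Return (accumulated gradient, number of mini-batches used).

    Accumulates gradients while the direction change keeps shrinking and
    stops when it grows again, which is where an optimizer step would run.
    """
    acc = list(minibatch_grads[0])
    prev_change = None
    used = 1
    for grad in minibatch_grads[1:]:
        new_acc = [a + g for a, g in zip(acc, grad)]
        change = angle(acc, new_acc)
        if prev_change is not None and change > prev_change:
            break  # direction started to fluctuate
        acc, prev_change = new_acc, change
        used += 1
    return acc, used


# Hypothetical per-mini-batch gradients for a 2-parameter model.
grads = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.5], [0.0, 3.0], [1.0, 0.0]]
acc, used = accumulate_until_fluctuation(grads)
```

With these toy gradients the direction stabilizes over the first three mini-batches and the fourth one swings it sharply, so the effective batch is three mini-batches.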

artificial intelligence, machine learning, natural language, (17 more...)

arXiv: 2005.02008

Country:

Europe > Germany > Saarland (0.05)
Asia > China > Tianjin Province > Tianjin (0.05)
Oceania > Australia > New South Wales > Sydney (0.05)
(9 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)

A Call for More Rigor in Unsupervised Cross-lingual Learning

Artetxe, Mikel, Ruder, Sebastian, Yogatama, Dani, Labaka, Gorka, Agirre, Eneko

arXiv.org Machine Learning, Apr-30-2020

In natural language processing, the main promise of multilingual learning is to bridge the digital language divide, to enable access to information and technology for the world's 6,900 languages (Ruder et al., 2019). For the purpose of this paper, we define "multilingual learning" as learning a common model for two or more languages from raw text, without any downstream task labels. Common use cases include translation as well as pretraining multilingual representations. We will use the term interchangeably with "cross-lingual learning".

... work implicitly includes monolingual and cross-lingual signals that constitute a departure from the pure setting. We review existing training signals as well as other signals that may be of interest for future study (§4). We then discuss methodological issues in UCL (e.g., validation, hyperparameter tuning) and propose best evaluation practices (§5). Finally, we provide a unified outlook of established research areas (cross-lingual word embeddings, deep multilingual models and unsupervised machine translation) in UCL (§6), ...

computational linguistic, linguistic, proceedings, (16 more...)

doi: 10.18653/v1/2020.acl-main.658

arXiv: 2004.14958

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Asia > China > Hong Kong (0.05)
(20 more...)

Genre: Overview (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Bayesian Online Meta-Learning with Laplace Approximation

Yap, Pau Ching, Ritter, Hippolyt, Barber, David

arXiv.org Machine Learning, Apr-30-2020

Neural networks are known to suffer from catastrophic forgetting when trained on sequential datasets. While there have been numerous attempts to solve this problem for large-scale supervised classification, little has been done to overcome catastrophic forgetting for few-shot classification problems. We demonstrate that the popular gradient-based few-shot meta-learning algorithm Model-Agnostic Meta-Learning (MAML) indeed suffers from catastrophic forgetting and introduce a Bayesian online meta-learning framework that tackles this problem. Our framework incorporates MAML into a Bayesian online learning algorithm with Laplace approximation. This framework enables few-shot classification on a range of sequentially arriving datasets with a single meta-learned model. The experimental evaluations demonstrate that our framework can effectively prevent forgetting in various few-shot classification settings compared to applying MAML sequentially.

approximation, dataset, learning, (15 more...)

arXiv: 2005.00146

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Automatic Cross-Replica Sharding of Weight Update in Data-Parallel Training

Xu, Yuanzhong, Lee, HyoukJoong, Chen, Dehao, Choi, Hongjun, Hechtman, Blake, Wang, Shibo

arXiv.org Machine Learning, Apr-28-2020

In data-parallel synchronous training of deep neural networks, different devices (replicas) run the same program with different partitions of the training batch, but the weight update computation is repeated on all replicas, because the weights do not have a batch dimension to partition. This can be a bottleneck for performance and scalability in typical language models with large weights, and in models with a small per-replica batch size, which is typical in large-scale training. This paper presents an approach to automatically shard the weight update computation across replicas with efficient communication primitives and data formatting, using static analysis and transformations on the training computation graph. We show that this technique achieves substantial speedups on typical image and language models on Cloud TPUs, requiring no change to model code. This technique helps close the gap between traditionally expensive (ADAM) and cheap (SGD) optimizers, as they will only take a small part of the training step time and have similar peak memory usage. It helped us achieve state-of-the-art training performance in Google's MLPerf 0.6 submission.
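The difference between a replicated and a sharded weight update can be simulated on a single machine. In the sketch below, plain Python lists stand in for device memory, and the reduce-scatter and all-gather steps are ordinary loops. The function names, the contiguous sharding scheme, and the choice of plain SGD are assumptions for illustration, not the paper's compiler transformation.

```python
def replicated_sgd_step(weights, replica_grads, lr):
    """Baseline: all-reduce the gradients, then every replica repeats
    the full weight update (computed once here, as all copies agree)."""
    summed = [sum(g[i] for g in replica_grads) for i in range(len(weights))]
    return [w - lr * s for w, s in zip(weights, summed)]


def sharded_sgd_step(weights, replica_grads, lr):
    """Sharded variant: each replica updates only its slice of the weights."""
    n_replicas = len(replica_grads)
    n = len(weights)
    shard = -(-n // n_replicas)  # ceiling division: elements per shard

    updated_shards = []
    for r in range(n_replicas):
        lo, hi = r * shard, min((r + 1) * shard, n)
        # Reduce-scatter: replica r receives summed gradients for its shard.
        summed = [sum(g[i] for g in replica_grads) for i in range(lo, hi)]
        # Replica r applies the optimizer to its shard only.
        updated_shards.append(
            [weights[lo + k] - lr * s for k, s in enumerate(summed)]
        )

    # All-gather: every replica reassembles the full updated weights.
    return [w for part in updated_shards for w in part]


# Hypothetical weights and per-replica gradients for two replicas.
weights = [1.0, 2.0, 3.0, 4.0, 5.0]
grads = [[0.5, 0.5, 0.5, 0.5, 0.5], [1.5, -0.5, 0.5, 0.0, 1.0]]
replicated = replicated_sgd_step(weights, grads, lr=0.1)
sharded = sharded_sgd_step(weights, grads, lr=0.1)
```

Both functions produce the same weights, but in the sharded version each replica performs roughly 1/N of the update arithmetic. The saving matters more for optimizers that do more work per weight than SGD.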

operator, replica, weight update, (17 more...)

arXiv: 2004.13336

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Georgia > Chatham County > Savannah (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
