AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

PaLM: Scaling Language Modeling with Pathways

Chowdhery, Aakanksha, Narang, Sharan, Devlin, Jacob, Bosma, Maarten, Mishra, Gaurav, Roberts, Adam, Barham, Paul, Chung, Hyung Won, Sutton, Charles, Gehrmann, Sebastian, Schuh, Parker, Shi, Kensen, Tsvyashchenko, Sasha, Maynez, Joshua, Rao, Abhishek, Barnes, Parker, Tay, Yi, Shazeer, Noam, Prabhakaran, Vinodkumar, Reif, Emily, Du, Nan, Hutchinson, Ben, Pope, Reiner, Bradbury, James, Austin, Jacob, Isard, Michael, Gur-Ari, Guy, Yin, Pengcheng, Duke, Toju, Levskaya, Anselm, Ghemawat, Sanjay, Dev, Sunipa, Michalewski, Henryk, Garcia, Xavier, Misra, Vedant, Robinson, Kevin, Fedus, Liam, Zhou, Denny, Ippolito, Daphne, Luan, David, Lim, Hyeontaek, Zoph, Barret, Spiridonov, Alexander, Sepassi, Ryan, Dohan, David, Agrawal, Shivani, Omernick, Mark, Dai, Andrew M., Pillai, Thanumalayan Sankaranarayana, Pellat, Marie, Lewkowycz, Aitor, Moreira, Erica, Child, Rewon, Polozov, Oleksandr, Lee, Katherine, Zhou, Zongwei, Wang, Xuezhi, Saeta, Brennan, Diaz, Mark, Firat, Orhan, Catasta, Michele, Wei, Jason, Meier-Hellstern, Kathy, Eck, Douglas, Dean, Jeff, Petrov, Slav, Fiedel, Noah

arXiv.org Artificial IntelligenceOct-5-2022

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2204.02311

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(26 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Health & Medicine (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Add feedback

Deps-SAN: Neural Machine Translation with Dependency-Scaled Self-Attention Network

Peng, Ru, Lin, Nankai, Fang, Yi, Jiang, Shengyi, Hao, Tianyong, Chen, Boyu, Zhao, Junbo

arXiv.org Artificial IntelligenceOct-4-2022

Syntax knowledge contributes its powerful strength in Neural machine translation (NMT) tasks. Early NMT works supposed that syntax details can be automatically learned from numerous texts via attention networks. However, succeeding researches pointed out that limited by the uncontrolled nature of attention computation, the NMT model requires an external syntax to capture the deep syntactic awareness. Although existing syntax-aware NMT methods have born great fruits in combining syntax, the additional workloads they introduced render the model heavy and slow. Particularly, these efforts scarcely involve the Transformer-based NMT and modify its core self-attention network (SAN). To this end, we propose a parameter-free, Dependency-scaled Self-Attention Network (Deps-SAN) for syntax-aware Transformer-based NMT. A quantified matrix of dependency closeness between tokens is constructed to impose explicit syntactic constraints into the SAN for learning syntactic details and dispelling the dispersion of attention distributions. Two knowledge sparsing techniques are further integrated to avoid the model overfitting the dependency noises introduced by the external parser. Experiments and analyses on IWSLT14 German-to-English and WMT16 German-to-English benchmark NMT tasks verify the effectiveness of our approach.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2111.11707

Country:

Asia > China (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

Improving Both Domain Robustness and Domain Adaptability in Machine Translation

Lai, Wen, Libovický, Jindřich, Fraser, Alexander

arXiv.org Artificial IntelligenceOct-4-2022

We consider two problems of NMT domain adaptation using meta-learning. First, we want to reach domain robustness, i.e., we want to reach high quality on both domains seen in the training data and unseen domains. Second, we want our systems to be adaptive, i.e., making it possible to finetune systems with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learning when improving the domain robustness of the model. In this paper, we propose a novel approach, RMLNMT (Robust Meta-Learning Framework for Neural Machine Translation Domain Adaptation), which improves the robustness of existing meta-learning models. More specifically, we show how to use a domain classifier in curriculum learning and we integrate the word-level domain mixing model into the meta-learning framework with a balanced sampling strategy. Experiments on English$\rightarrow$German and English$\rightarrow$Chinese translation show that RMLNMT improves in terms of both domain robustness and domain adaptability in seen and unseen domains. Our source code is available at https://github.com/lavine-lmu/RMLNMT.

computational linguistic, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2112.08288

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Hong Kong (0.04)
(14 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation

Chang, Chih-Chiang, Lee, Hung-yi

arXiv.org Artificial IntelligenceOct-3-2022

Simultaneous speech translation (SimulST) is a challenging task aiming to translate streaming speech before the complete input is observed. A SimulST system generally includes two components: the pre-decision that aggregates the speech information and the policy that decides to read or write. While recent works had proposed various strategies to improve the pre-decision, they mainly adopt the fixed wait-k policy, leaving the adaptive policies rarely explored. This paper proposes to model the adaptive policy by adapting the Continuous Integrate-and-Fire (CIF). Compared with monotonic multihead attention (MMA), our method has the advantage of simpler computation, superior quality at low latency, and better generalization to long utterances. We conduct experiments on the MuST-C V2 dataset and show the effectiveness of our approach.

machine learning, natural language, translation, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2022-10627

2204.09595

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

The boundaries of meaning: a case study in neural machine translation

Balashov, Yuri

arXiv.org Artificial IntelligenceOct-2-2022

The success of deep learning in natural language processing raises intriguing questions about the nature of linguistic meaning and ways in which it can be processed by natural and artificial systems. One such question has to do with subword segmentation algorithms widely employed in language modeling, machine translation, and other tasks since 2016. These algorithms often cut words into semantically opaque pieces, such as 'period', 'on', 't', and 'ist' in 'period|on|t|ist'. The system then represents the resulting segments in a dense vector space, which is expected to model grammatical relations among them. This representation may in turn be used to map 'period|on|t|ist' (English) to 'par|od|ont|iste' (French). Thus, instead of being modeled at the lexical level, translation is reformulated more generally as the task of learning the best bilingual mapping between the sequences of subword segments of two languages; and sometimes even between pure character sequences: 'p|e|r|i|o|d|o|n|t|i|s|t' $\rightarrow$ 'p|a|r|o|d|o|n|t|i|s|t|e'. Such subword segmentations and alignments are at work in highly efficient end-to-end machine translation systems, despite their allegedly opaque nature. The computational value of such processes is unquestionable. But do they have any linguistic or philosophical plausibility? I attempt to cast light on this question by reviewing the relevant details of the subword segmentation algorithms and by relating them to important philosophical and linguistic debates, in the spirit of making artificial intelligence more transparent and explainable.

computational linguistic, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/0020174X.2022.2113429

2210.00613

Country:

North America > United States > New Jersey > Middlesex County > New Brunswick (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Georgia > Clarke County > Athens (0.14)
(21 more...)

Genre:

Overview (0.67)
Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GitHub - NiuTrans/MTBook: 《机器翻译：基础与模型》肖桐朱靖波著 - Machine Translation: Foundations and Models

#artificialintelligenceOct-1-2022, 02:00:34 GMT

《机器翻译：基础与模型》肖桐朱靖波著 - Machine Translation: Foundations and Models - GitHub - NiuTrans/MTBook: 《机器翻译：基础与模型》肖桐朱靖波著 - Machine Translation: Foundations and Models

foundation and model, machine translation, size 5000000, (5 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.80)

Add feedback

MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

Gupta, Kshitij

arXiv.org Artificial IntelligenceOct-1-2022

Large pre-trained language models have brought remarkable progress in NLP. Pre-training and Fine-tuning have given state-of-art performance across tasks in text processing. Data Augmentation techniques have also helped build state-of-art models on low or zero resource tasks. Many works in the past have attempted at learning a single massively-multilingual machine translation model for zero-shot translation. Although those translation models are producing correct translations, the main challenge is those models are producing the wrong languages for zero-shot translation. This work and its results indicate that prompt conditioned large models do not suffer from off-target language errors i.e. errors arising due to translation to wrong languages. We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multi-lingual machine translation.

large language model, natural language, translation, (13 more...)

arXiv.org Artificial Intelligence

2210.0032

Country:

Asia > India > Rajasthan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Model

Nguyen, Xuan-Phi, Joty, Shafiq, Kui, Wu, Aw, Ai Ti

arXiv.org Artificial IntelligenceOct-1-2022

Numerous recent work on unsupervised machine translation (UMT) implies that competent unsupervised translations of low-resource and unrelated languages, such as Nepali or Sinhala, are only possible if the model is trained in a massive multilingual environment, where these low-resource languages are mixed with high-resource counterparts. Nonetheless, while the high-resource languages greatly help kick-start the target low-resource translation tasks, the language discrepancy between them may hinder their further improvement. In this work, we propose a simple refinement procedure to separate languages from a pre-trained multilingual UMT model for it to focus on only the target low-resource task. Our method achieves the state of the art in the fully unsupervised translation tasks of English to Nepali, Sinhala, Gujarati, Latvian, Estonian and Kazakh, with BLEU score gains of 3.5, 3.5, 3.3, 4.1, 4.2, and 3.3, respectively.

machine learning, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2205.15544

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)

Add feedback

Can AI help to increase access to all languages?

#artificialintelligenceSep-30-2022, 15:14:58 GMT

Languages are the main medium of communication but there are more than 7,100 languages spoken around the world. People who live in different parts of the world speak different languages and it's sometimes hard to communicate with people who don't speak our language. This hinders relationships between people and makes it hard to understand one another or build trust. The ability to translate language, then, makes it easier to communicate across borders, and make information more accessible. With the advances in technology and artificial intelligence, online translators such as Google Translate, DeepL, and Bing Translate have made communication a lot easier among those speaking different languages.

machine translation, translation, translator, (16 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.42)

Add feedback

Calibrating Sequence likelihood Improves Conditional Language Generation

Zhao, Yao, Khalman, Misha, Joshi, Rishabh, Narayan, Shashi, Saleh, Mohammad, Liu, Peter J.

arXiv.org Artificial IntelligenceSep-30-2022

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences. While MLE trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality. This has been empirically observed in beam search decoding as output quality degrading with large beam sizes, and decoding strategies benefiting from heuristics such as length normalization and repetition-blocking. In this work, we introduce sequence likelihood calibration (SLiC) where the likelihood of model generated sequences are calibrated to better align with reference sequences in the model's latent space. With SLiC, decoding heuristics become unnecessary and decoding candidates' quality significantly improves regardless of the decoding method. Furthermore, SLiC shows no sign of diminishing returns with model scale, and presents alternative ways to improve quality with limited training and inference budgets. With SLiC, we exceed or match SOTA results on a wide range of generation tasks spanning abstractive summarization, question generation, abstractive question answering and data-to-text generation, even with modest-sized models.

computational linguistic, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2210.00045

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(10 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
(2 more...)

Add feedback