AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

How Positional Encoding work (Transformers) part 2

#artificialintelligence Oct-16-2022, 19:36:32 GMT

Abstract: Adapting Deep Learning (DL) techniques to automate non-trivial coding activities, such as code documentation and defect detection, has been intensively studied recently. Learning to predict code changes is one of the popular and essential investigations. Prior studies have shown that DL techniques such as Neural Machine Translation (NMT) can benefit meaningful code changes, including bug fixing and code refactoring. However, NMT models may encounter a bottleneck when modeling long sequences and are thus limited in accurately predicting code changes. In this work, we design a Transformer-based approach, considering that the Transformer has proven effective in capturing long-term dependencies.

code change, positional encoding work, transformer, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)
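
The entry above concerns positional encoding in Transformers, but the abstract does not say which scheme is used. As a minimal sketch, assuming the fixed sine/cosine encoding from the original Transformer paper, the table of position vectors can be computed as follows. The function name and the example sizes are illustrative only.

```python
import math


def sinusoidal_positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal encodings.

    Assumption: the fixed scheme of the original Transformer,
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    table = []
    for pos in range(num_positions):
        row = []
        for dim in range(d_model):
            # Each sin/cos pair shares one frequency, set by the even index.
            exponent = (dim - dim % 2) / d_model
            angle = pos / (10000 ** exponent)
            row.append(math.sin(angle) if dim % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


# Hypothetical sizes, chosen only to keep the example small.
pe = sinusoidal_positional_encoding(num_positions=4, d_model=6)
```

Each row is added to the token embedding at that position, which gives an otherwise order-agnostic attention layer access to word order.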

How to boost your business with natural language processing (NLP)

#artificialintelligence Oct-16-2022, 01:14:01 GMT

Natural language processing (NLP) is a powerful combination of linguistics and computer science that, through the study of language and the creation of intelligent systems, makes human language as intelligible to machines as it would be for a human being, whether in text or speech format. As a branch of artificial intelligence (AI), NLP enables computers and machines to understand, interpret and manipulate human language using computational linguistics and statistical models, machine learning methods and deep learning processes. The knowledge extracted by these technologies is converted into algorithms that teach machines to perform a myriad of tasks that are infinitely valuable to businesses. The more data NLP algorithms receive, the more precise text analysis models become. NLP includes an immense diversity of techniques, from statistical and machine learning methods to algorithmic and rule-based approaches.

artificial intelligence, machine learning, natural language, (18 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.72)

Increasing Visual Awareness in Multimodal Neural Machine Translation from an Information Theoretic Perspective

Ji, Baijun, Zhang, Tong, Zou, Yicheng, Hu, Bojie, Shen, Si

arXiv.org Artificial Intelligence Oct-16-2022

Multimodal machine translation (MMT) aims to improve translation quality by equipping the source sentence with its corresponding image. Despite the promising performance, MMT models still suffer from input degradation: models focus more on textual information while visual information is generally overlooked. In this paper, we endeavor to improve MMT performance by increasing visual awareness from an information theoretic perspective. In detail, we decompose the informative visual signals into two parts: source-specific information and target-specific information. We use mutual information to quantify them and propose two methods for objective optimization to better leverage visual signals. Experiments on two datasets demonstrate that our approach can effectively enhance the visual awareness of MMT models and achieve superior results against strong baselines.

artificial intelligence, information, natural language, (17 more...)

arXiv.org Artificial Intelligence

2210.08478

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
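
The abstract above quantifies visual signals with mutual information but does not describe the estimator. As a toy illustration only, and not the paper's objective, the sketch below computes mutual information for two discrete variables from their joint probability table. The function name and the two example tables are assumptions made for the example.

```python
import math


def mutual_information(joint):
    """Mutual information I(X; Y) in nats from a joint probability table.

    joint[x][y] is p(x, y); entries must be non-negative and sum to 1.
    """
    p_x = [sum(row) for row in joint]        # marginal over Y
    p_y = [sum(col) for col in zip(*joint)]  # marginal over X
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p_xy in enumerate(row):
            if p_xy > 0.0:  # 0 * log(0) is taken as 0
                mi += p_xy * math.log(p_xy / (p_x[x] * p_y[y]))
    return mi


# Hypothetical tables: one independent pair, one perfectly dependent pair.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The independent table yields zero, while the perfectly dependent one yields log 2 nats, the most that two binary variables can share.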

RedApt: An Adaptor for wav2vec 2 Encoding: Faster and Smaller Speech Translation without Quality Compromise

Zhao, Jinming, Yang, Hao, Haffari, Gholamreza, Shareghi, Ehsan

arXiv.org Artificial Intelligence Oct-16-2022

Pre-trained speech Transformers in speech translation (ST) have facilitated state-of-the-art (SotA) results; yet, using such encoders is computationally expensive. To improve this, we present a novel Reducer Adaptor block, RedApt, that could be seamlessly integrated within any Transformer-based speech encoding architecture. Integrating the pretrained wav2vec 2 speech encoder with RedApt brings a 41% speedup and a 33% memory reduction with 24% fewer FLOPs at inference. To our positive surprise, our ST model with RedApt outperforms the SotA architecture by an average of 0.68 BLEU score on 8 language pairs from Must-C.

machine learning, natural language, translation, (22 more...)

arXiv.org Artificial Intelligence

2210.08475

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Brief Review -- Unsupervised Machine Translation Using Monolingual Corpora Only

#artificialintelligence Oct-15-2022, 15:09:20 GMT

Using the GAN idea, an NMT model can be trained without parallel data, which I think is similar to CycleGAN in the image domain. 2013 … 2018 [UMNT] … 2020 [Batch Augment, BA] [GPT-3] [T5]…

brief review, monolingual corpora only, unsupervised machine translation

#artificialintelligence

Genre: Overview (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)

Modeling Context With Linear Attention for Scalable Document-Level Translation

Wu, Zhaofeng, Peng, Hao, Pappas, Nikolaos, Smith, Noah A.

arXiv.org Artificial Intelligence Oct-15-2022

Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents as their attention layers have quadratic complexity in the sequence length. Recent efforts on efficient attention improve scalability, but their effect on document translation remains unexplored. In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. We evaluate the model on IWSLT 2015 and OpenSubtitles 2018 against the transformer, demonstrating substantially increased decoding speed on long sequences with similar or better BLEU scores. We show that sentential gating further improves translation quality on IWSLT.

machine learning, natural language, translation, (20 more...)

arXiv.org Artificial Intelligence

2210.08431

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
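
The abstract above contrasts quadratic attention with linear attention. The sketch below illustrates the general idea under stated assumptions: once the attention weight is written as a dot product of feature maps, causal attention can be computed from running sums, so each decoding step costs the same regardless of how many tokens came before. The feature map elu(x) + 1 is a simple stand-in chosen for illustration and is not claimed to be the one used in the paper; the sentential gate is not modeled. All function names and inputs are hypothetical.

```python
import math


def phi(vec):
    # Positive feature map elu(x) + 1; an illustrative stand-in.
    return [x + 1.0 if x > 0 else math.exp(x) for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_causal_attention(queries, keys, values):
    """Reference: each step re-scans all earlier keys, O(n^2) overall."""
    dim = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        weights = [dot(phi(q), phi(keys[j])) for j in range(i + 1)]
        total = sum(weights)
        outputs.append([
            sum(w * values[j][d] for j, w in enumerate(weights)) / total
            for d in range(dim)
        ])
    return outputs


def linear_causal_attention(queries, keys, values):
    """Same result via running sums, O(n) in the sequence length."""
    d_k, d_v = len(keys[0]), len(values[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k_j) v_j^T
    z = [0.0] * d_k                        # running sum of phi(k_j)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fk = phi(k)
        for a in range(d_k):
            z[a] += fk[a]
            for b in range(d_v):
                S[a][b] += fk[a] * v[b]
        fq = phi(q)
        denom = dot(fq, z)
        outputs.append([
            sum(fq[a] * S[a][b] for a in range(d_k)) / denom
            for b in range(d_v)
        ])
    return outputs


# Hypothetical toy inputs: three tokens, two dimensions each.
queries = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
keys = [[0.3, 0.1], [-0.1, 0.2], [0.2, -0.4]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
slow = quadratic_causal_attention(queries, keys, values)
fast = linear_causal_attention(queries, keys, values)
```

Both functions use the same kernel rather than softmax, so their outputs agree up to floating-point error. The linear version keeps only the fixed-size state `S` and `z`, which is what makes decoding over long sequences cheaper.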

The User-Aware Arabic Gender Rewriter

Alhafni, Bashar, Obeid, Ossama, Habash, Nizar

arXiv.org Artificial Intelligence Oct-14-2022

We introduce the User-Aware Arabic Gender Rewriter, a user-centric web-based system for Arabic gender rewriting in contexts involving two users. The system takes either Arabic or English sentences as input, and provides users with the ability to specify their desired first and/or second person target genders. The system outputs gender rewritten alternatives of the Arabic input sentences (or their Arabic translations in case of English input) to match the target users' gender preferences.

artificial intelligence, computational linguistic, natural language, (14 more...)

arXiv.org Artificial Intelligence

2210.07538

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Europe > Italy > Tuscany > Florence (0.05)
(19 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Picard understanding Darmok: A Dataset and Model for Metaphor-Rich Translation in a Constructed Language

Jansen, Peter, Boyd-Graber, Jordan

arXiv.org Artificial Intelligence Oct-14-2022

Tamarian, a fictional language introduced in the Star Trek episode Darmok, communicates meaning through utterances of metaphorical references, such as "Darmok and Jalad at Tanagra" instead of "We should work together." This work assembles a Tamarian-English dictionary of utterances from the original episode and several follow-on novels, and uses this to construct a parallel corpus of 456 English-Tamarian utterances. A machine translation system based on a large language model (T5) is trained using this parallel corpus, and is shown to produce an accuracy of 76% when translating from English to Tamarian on known utterances.

artificial intelligence, natural language, utterance, (17 more...)

arXiv.org Artificial Intelligence

2107.08146

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Arizona (0.05)
North America > United States > New York > Kings County > New York City (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification

Liu, Yu Lu, Bawden, Rachel, Scialom, Thomas, Sagot, Benoît, Cheung, Jackie Chi Kit

arXiv.org Artificial Intelligence Oct-13-2022

In text summarization and simplification, system outputs must be evaluated along multiple dimensions such as relevance, factual consistency, fluency, and grammaticality, and a wide range of possible outputs could be of high quality. These properties make the development of an adaptable, reference-less evaluation metric both necessary and challenging. We introduce MaskEval, a reference-less metric for text summarization and simplification that operates by performing masked language modeling (MLM) on the concatenation of the candidate and the source texts. It features an attention-like weighting mechanism to modulate the relative importance of each MLM step, which crucially allows it to be adapted to evaluate different quality dimensions. We demonstrate its effectiveness on English summarization and simplification in terms of correlations with human judgments, and explore transfer scenarios between the two tasks.

artificial intelligence, computational linguistic, natural language, (17 more...)

arXiv.org Artificial Intelligence

2205.12394

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > Dominican Republic (0.04)
(12 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

On the Explainability of Natural Language Processing Deep Models

Zini, Julia El, Awad, Mariette

arXiv.org Artificial Intelligence Oct-13-2022

While there has been a recent explosion of work on Explainable AI (ExAI) on deep models that operate on imagery and tabular data, textual datasets present new challenges to the ExAI community. Such challenges can be attributed to the lack of input structure in textual data, the use of word embeddings that add to the opacity of the models and the difficulty of the visualization of the inner workings of deep models when they are trained on textual data. Lately, methods have been developed to address the aforementioned challenges and present satisfactory explanations on Natural Language Processing (NLP) models. However, such methods are yet to be studied in a comprehensive framework where common challenges are properly stated and rigorous evaluation practices and metrics are proposed. Motivated to democratize ExAI methods in the NLP field, we present in this work a survey that studies model-agnostic as well as model-specific explainability methods on NLP models. Such methods can either develop inherently interpretable NLP models or operate on pre-trained models in a post-hoc manner. We make this distinction and we further decompose the methods into three categories according to what they explain: (1) word embeddings (input-level), (2) inner workings of NLP models (processing-level) and (3) models' decisions (output-level). We also detail the different evaluation approaches for interpretability methods in the NLP field. Finally, we present a case-study on the well-known neural machine translation in an appendix and we propose promising future research directions for ExAI in the NLP field.

interpretability, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3529755

2210.06929

Country:

Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.04)
North America > United States > California > Alameda County > Livermore (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)
