AITopics

2302.09424

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong > Kowloon (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Consumer Products & Services (0.74)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Xu, Jitao, Crego, Josep, Yvon, François

Integrating Translation Memories into Non-Autoregressive Machine Translation

Non-autoregressive machine translation (NAT) has recently made great progress. However, most works to date have focused on standard translation tasks, even though some edit-based NAT models, such as the Levenshtein Transformer (LevT), seem well suited to translating with a Translation Memory (TM). This is the scenario considered here. We first analyze the vanilla LevT model and explain why it does not do well in this setting. We then propose a new variant, TM-LevT, and show how to effectively train this model. By modifying the data presentation and introducing an extra deletion operation, we obtain performance that is on par with an autoregressive approach, while reducing the decoding load. We also show that incorporating TMs during training removes the need for knowledge distillation, a well-known trick used to mitigate the multimodality issue.

artificial intelligence, machine learning, natural language, (13 more...)

2210.0602

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(10 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

Hendy, Amr, Abdelrehim, Mohamed, Sharaf, Amr, Raunak, Vikas, Gabr, Mohamed, Matsushita, Hitokazu, Kim, Young Jin, Afify, Mohamed, Awadalla, Hany Hassan

Generative Pre-trained Transformer (GPT) models have shown remarkable capabilities for natural language generation, but their performance for machine translation has not been thoroughly investigated. In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering various aspects such as the quality of different GPT models in comparison with state-of-the-art research and commercial systems, the effect of prompting strategies, robustness towards domain shifts, and document-level translation. We experiment with eighteen different translation directions involving high- and low-resource languages, as well as non-English-centric translations, and evaluate the performance of three GPT models: ChatGPT, GPT3.5 (text-davinci-003), and text-davinci-002. Our results show that GPT models achieve very competitive translation quality for high-resource languages, while having limited capabilities for low-resource languages. We also show that hybrid approaches, which combine GPT models with other translation systems, can further enhance the translation quality. We perform comprehensive analysis and human evaluation to further understand the characteristics of GPT translations. We hope that our paper provides valuable insights for researchers and practitioners in the field and helps to better understand the potential and limitations of GPT models for translation.

large language model, machine learning, translation, (19 more...)

2302.0921

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Arizona (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

Wan, Yu, Bao, Keqin, Liu, Dayiheng, Yang, Baosong, Wong, Derek F., Chao, Lidia S., Lei, Wenqiang, Xie, Jun

In this report, we present our submission to the WMT 2022 Metrics Shared Task. We build our system based on the core idea of UNITE (Unified Translation Evaluation), which unifies source-only, reference-only, and source-reference-combined evaluation scenarios into one single model. Specifically, during the model pre-training phase, we first use pseudo-labeled data examples to continue pre-training UNITE. Notably, to reduce the gap between pre-training and fine-tuning, we use data cropping and a ranking-based score normalization strategy. During the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. In particular, we collect the results from models with different pre-trained language model backbones, and use different ensembling strategies for the translation directions involved.

computational linguistic, machine learning, natural language, (15 more...)

2210.09683

Country:

Asia > China (0.41)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
(10 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

Bao, Keqin, Wan, Yu, Liu, Dayiheng, Yang, Baosong, Lei, Wenqiang, He, Xiangnan, Wong, Derek F., Xie, Jun

Fine-grained information on translation errors is helpful for the translation evaluation community. Existing approaches cannot consider error position and type simultaneously, and so fail to integrate the two kinds of error information. In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims at identifying both the position and the type of translation errors in given source-hypothesis sentence pairs. In addition, we build an FG-TED model to predict addition and omission errors, two typical translation accuracy errors. First, we use a word-level classification paradigm to form our model and use shortcut learning reduction to relieve the influence of monolingual features. We also construct synthetic datasets for model training, and resolve the disagreement of data labeling in authoritative datasets, making the experimental benchmark concordant. Experiments show that our model can identify both error type and position concurrently, and gives state-of-the-art results on the restored dataset. Our model also delivers more reliable predictions in low-resource and transfer scenarios than existing baselines. The related datasets and the source code will be released in the future.

computational linguistic, machine learning, natural language, (19 more...)

2302.08975

Country:

Africa > Middle East > Somalia (0.14)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Tianjin Province > Tianjin (0.04)
(16 more...)

Genre: Research Report (0.82)

Industry:

Transportation (0.68)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

Bao, Keqin, Wan, Yu, Liu, Dayiheng, Yang, Baosong, Lei, Wenqiang, He, Xiangnan, Wong, Derek F., Xie, Jun

In this paper, we present our submission to the sentence-level MQM benchmark at the Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation). Specifically, our systems employ the framework of UniTE, which combines three types of input formats during training with a pre-trained language model. First, we use pseudo-labeled data examples for the continued pre-training phase. Notably, to reduce the gap between pre-training and fine-tuning, we use data pruning and a ranking-based score normalization strategy. For the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. Finally, we collect the source-only evaluation results, and ensemble the predictions generated by two UniTE models, whose backbones are XLM-R and InfoXLM, respectively. Results show that our models reach 1st overall ranking in the Multilingual and English-Russian settings, and 2nd overall ranking in the English-German and Chinese-English settings, showing relatively strong performance in this year's quality estimation competition.

computational linguistic, machine learning, natural language, (15 more...)

2210.10049

Country:

Asia > China (0.41)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Asia > Macao (0.05)
(9 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Zhang, Zhi, Yannakoudakis, Helen, Zhen, Xiantong, Shutova, Ekaterina

CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension

The task of multimodal referring expression comprehension (REC), aiming at localizing an image region described by a natural language expression, has recently received increasing attention within the research community. In this paper, we specifically focus on referring expression comprehension with commonsense knowledge (KB-Ref), a task which typically requires reasoning beyond spatial, visual or semantic information. We propose a novel framework for Commonsense Knowledge Enhanced Transformers (CK-Transformer) which effectively integrates commonsense knowledge into the representations of objects in an image, facilitating identification of the target objects referred to by the expressions. We conduct extensive experiments on several benchmarks for the task of KB-Ref. Our results show that the proposed CK-Transformer achieves a new state of the art, with an absolute improvement of 3.14% accuracy over the existing state of the art.

artificial intelligence, natural language, text processing, (19 more...)

2302.09027

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

Paaß, Gerhard, Giesselbach, Sven

Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media

arXiv.org Artificial Intelligence, Feb-16-2023

This open access book provides a comprehensive overview of the state of the art in research and applications of Foundation Models and is intended for readers familiar with basic Natural Language Processing (NLP) concepts. In recent years, a revolutionary new paradigm has been developed for training models for NLP. These models are first pre-trained on large collections of text documents to acquire general syntactic knowledge and semantic information. Then, they are fine-tuned for specific tasks, which they can often solve with superhuman accuracy. When the models are large enough, they can be instructed by prompts to solve new tasks without any fine-tuning. Moreover, they can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning. Because they provide a blueprint for solving many tasks in artificial intelligence, they have been called Foundation Models. After a brief introduction to basic NLP models, the main pre-trained language models BERT, GPT and sequence-to-sequence transformer are described, as well as the concepts of self-attention and context-sensitive embedding. Then, different approaches to improving these models are discussed, such as expanding the pre-training criteria, increasing the length of input texts, or including extra knowledge. An overview of the best-performing models for about twenty application areas is then presented, e.g., question answering, translation, story generation, dialog systems, and generating images from text. For each application area, the strengths and weaknesses of current models are discussed, and an outlook on further developments is given. In addition, links are provided to freely available program code. A concluding chapter summarizes the economic opportunities, mitigation of risks, and potential developments of AI.

large language model, machine learning, pattern recognition, (32 more...)

2302.08575

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.13)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > Canada > Ontario > Toronto (0.13)
(43 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > Promising Solution (1.00)
(4 more...)

Industry:

Transportation > Passenger (1.00)
Media > Television (1.00)
Media > News (1.00)
(21 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(23 more...)

Yehudai, Asaf, Cattan, Arie, Abend, Omri, Stanovsky, Gabriel

Evaluating and Improving the Coreference Capabilities of Machine Translation Models

arXiv.org Artificial Intelligence, Feb-16-2023

Machine translation (MT) requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora. In this work, we ask: How well do MT models learn coreference resolution from implicit signal? To answer this question, we develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language. We further evaluate several prominent open-source and commercial MT systems, translating from English to six target languages, and compare them to state-of-the-art coreference resolvers on three challenging benchmarks. Our results show that the monolingual resolvers greatly outperform MT models. Motivated by this result, we experiment with different methods for incorporating the output of coreference resolution models in MT, showing improvement over strong baselines.

artificial intelligence, natural language, translation, (15 more...)

2302.08464

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

#artificialintelligence, Feb-15-2023, 19:15:08 GMT

New developments in Machine Translation part3

Abstract: Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high.

machine translation part3, semantic similarity, translation task, (7 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.39)

Industry: Information Technology > Security & Privacy (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.39)