AITopics

2302.08975

Country:

Africa > Middle East > Somalia (0.14)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Tianjin Province > Tianjin (0.04)
(16 more...)

Genre: Research Report (0.82)

Industry:

Transportation (0.68)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

arXiv.org Artificial Intelligence, Feb-17-2023

Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

Bao, Keqin, Wan, Yu, Liu, Dayiheng, Yang, Baosong, Lei, Wenqiang, He, Xiangnan, Wong, Derek F., Xie, Jun

In this paper, we present our submission, named UniTE (Unified Translation Evaluation), to the sentence-level MQM benchmark of the Quality Estimation Shared Task. Specifically, our systems employ the UniTE framework, which combines three types of input formats during training with a pre-trained language model. First, we use pseudo-labeled data examples for the continued pre-training phase. Notably, to reduce the gap between pre-training and fine-tuning, we use data pruning and a ranking-based score normalization strategy. For the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. Finally, we collect the source-only evaluation results and ensemble the predictions generated by two UniTE models, whose backbones are XLM-R and InfoXLM, respectively. Results show that our models rank 1st overall in the Multilingual and English-Russian settings and 2nd overall in the English-German and Chinese-English settings, showing relatively strong performance in this year's quality estimation competition.
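The abstract above names a ranking-based score normalization and an ensemble of two models but gives no formulas. The following is a minimal sketch of one plausible reading, not the authors' implementation: raw pseudo-label scores are replaced by their normalized rank (ties share the average rank), and two models' predictions are averaged. The function names `rank_normalize` and `ensemble` and the choice of plain averaging are assumptions made for illustration.

```python
def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank; tied scores share the average rank.

    Assumption: this is one simple rank-based normalization, not necessarily
    the strategy used in the paper.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend the group while the next score equals the current one.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie group
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return [r / len(scores) for r in ranks]


def ensemble(pred_a, pred_b):
    """Average the per-segment predictions of two models (assumed weighting)."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


if __name__ == "__main__":
    print(rank_normalize([10.0, -3.0, 10.0, 0.5]))
    print(ensemble([0.25, 0.5], [0.75, 1.0]))
```

Rank-based scores depend only on the ordering of the raw scores, which is one way to make scores from different labeling sources comparable before fine-tuning.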

computational linguistic, machine learning, natural language, (15 more...)

2210.10049

Country:

Asia > China (0.41)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Asia > Macao (0.05)
(9 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Zhang, Zhi, Yannakoudakis, Helen, Zhen, Xiantong, Shutova, Ekaterina

CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension

arXiv.org Artificial Intelligence, Feb-17-2023

The task of multimodal referring expression comprehension (REC), aiming at localizing an image region described by a natural language expression, has recently received increasing attention within the research community. In this paper, we specifically focus on referring expression comprehension with commonsense knowledge (KB-Ref), a task which typically requires reasoning beyond spatial, visual or semantic information. We propose a novel framework for Commonsense Knowledge Enhanced Transformers (CK-Transformer) which effectively integrates commonsense knowledge into the representations of objects in an image, facilitating identification of the target objects referred to by the expressions. We conduct extensive experiments on several benchmarks for the task of KB-Ref. Our results show that the proposed CK-Transformer achieves a new state of the art, with an absolute improvement of 3.14% accuracy over the existing state of the art.

artificial intelligence, natural language, text processing, (19 more...)

2302.09027

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

Paaß, Gerhard, Giesselbach, Sven

Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media

arXiv.org Artificial Intelligence, Feb-16-2023

This open access book provides a comprehensive overview of the state of the art in research and applications of Foundation Models and is intended for readers familiar with basic Natural Language Processing (NLP) concepts. In recent years, a revolutionary new paradigm has been developed for training models for NLP. These models are first pre-trained on large collections of text documents to acquire general syntactic knowledge and semantic information. Then, they are fine-tuned for specific tasks, which they can often solve with superhuman accuracy. When the models are large enough, they can be instructed by prompts to solve new tasks without any fine-tuning. Moreover, they can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning. Because they provide a blueprint for solving many tasks in artificial intelligence, they have been called Foundation Models. After a brief introduction to basic NLP models, the main pre-trained language models (BERT, GPT and the sequence-to-sequence Transformer) are described, as well as the concepts of self-attention and context-sensitive embedding. Then, different approaches to improving these models are discussed, such as expanding the pre-training criteria, increasing the length of input texts, or including extra knowledge. An overview of the best-performing models for about twenty application areas is then presented, e.g., question answering, translation, story generation, dialog systems, and generating images from text. For each application area, the strengths and weaknesses of current models are discussed, and an outlook on further developments is given. In addition, links are provided to freely available program code. A concluding chapter summarizes the economic opportunities, mitigation of risks, and potential developments of AI.

large language model, machine learning, pattern recognition, (32 more...)

2302.08575

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.13)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > Canada > Ontario > Toronto (0.13)
(43 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > Promising Solution (1.00)
(4 more...)

Industry:

Transportation > Passenger (1.00)
Media > Television (1.00)
Media > News (1.00)
(21 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(23 more...)

Yehudai, Asaf, Cattan, Arie, Abend, Omri, Stanovsky, Gabriel

Evaluating and Improving the Coreference Capabilities of Machine Translation Models

arXiv.org Artificial Intelligence, Feb-16-2023

Machine translation (MT) requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora. In this work, we ask: How well do MT models learn coreference resolution from implicit signal? To answer this question, we develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language. We further evaluate several prominent open-source and commercial MT systems, translating from English to six target languages, and compare them to state-of-the-art coreference resolvers on three challenging benchmarks. Our results show that the monolingual resolvers greatly outperform MT models. Motivated by this result, we experiment with different methods for incorporating the output of coreference resolution models in MT, showing improvement over strong baselines.

artificial intelligence, natural language, translation, (15 more...)

2302.08464

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

#artificialintelligence, Feb-15-2023, 19:15:08 GMT

New developments in Machine Translation part3

Abstract: Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high.

machine translation part3, semantic similarity, translation task, (7 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.39)

Industry: Information Technology > Security & Privacy (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.39)

ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence

Rivkin, Dmitriy, Dudek, Gregory, Kakodkar, Nikhil, Meger, David, Limoyo, Oliver, Liu, Xue, Hogan, Francois

Our work examines how large language models can be used for robotic planning and sampling, specifically in the context of automated photographic documentation. We illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general-purpose language models (LMs) and vision-language models (VLMs). Given a high-level description of an event, we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.
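The matching step described above (picking the video frames that best fit each generated photo description) can be illustrated with a small sketch. It assumes that descriptions and frames have already been embedded into a shared vector space by some vision-language model; the embeddings below are made-up numbers, and `cosine` and `best_frames` are hypothetical helper names, not part of the authors' system.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_frames(description_vecs, frame_vecs):
    """For each description embedding, return the index of the most similar frame."""
    return [
        max(range(len(frame_vecs)), key=lambda j: cosine(d, frame_vecs[j]))
        for d in description_vecs
    ]


if __name__ == "__main__":
    # Toy 2-d embeddings standing in for VLM outputs (assumed, not real data).
    descriptions = [[1.0, 0.0], [0.0, 1.0]]
    frames = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
    print(best_frames(descriptions, frames))
```

A real system would also need to handle a streaming video feed and avoid near-duplicate frames; this sketch covers only the nearest-match selection.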

artificial intelligence, large language model, natural language, (18 more...)

2302.07931

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Photography (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Lindemann, Matthias, Koller, Alexander, Titov, Ivan

Compositional Generalisation with Structured Reordering and Fertility Layers

Seq2seq models have been shown to struggle with compositional generalisation, i.e. generalising to new and potentially more complex structures than seen during training. Taking inspiration from grammar-based models that excel at compositional generalisation, we present a flexible end-to-end differentiable neural model that composes two structural operations: a fertility step, which we introduce in this work, and a reordering step based on previous work (Wang et al., 2021). To ensure differentiability, we use the expected value of each step. Our model outperforms seq2seq models by a wide margin on challenging compositional splits of realistic semantic parsing tasks that require generalisation to longer examples. It also compares favourably to other models targeting compositional generalisation.
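To make the idea of taking the expected value of a fertility step concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: each input token independently draws a fertility (how many copies of it appear in the output), copies keep the input order, and tokens are represented by plain vectors. The names `alignment_marginals` and `expected_output` are hypothetical. The code computes, for every output position, the probability that it holds a copy of each input token, and then the expected vector at that position.

```python
def convolve(p, q):
    """Distribution of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def alignment_marginals(fertility_probs, out_len):
    """Return marg, where marg[j][i] = P(output position j is a copy of input token i).

    fertility_probs[i][k] is the probability that token i has fertility k.
    """
    start = [1.0]  # distribution over the output offset where the current token begins
    marg = [[0.0] * len(fertility_probs) for _ in range(out_len)]
    for i, f in enumerate(fertility_probs):
        # tail[k] = P(fertility of token i > k): token i reaches k positions past its start.
        tail = [sum(f[k + 1:]) for k in range(len(f))]
        for s, p_start in enumerate(start):
            for k, t in enumerate(tail):
                if s + k < out_len:
                    marg[s + k][i] += p_start * t
        start = convolve(start, f)  # the next token begins after token i's copies
    return marg


def expected_output(fertility_probs, embeddings, out_len):
    """Expected vector at each output position under the marginals above."""
    marg = alignment_marginals(fertility_probs, out_len)
    dim = len(embeddings[0])
    return [
        [sum(marg[j][i] * embeddings[i][d] for i in range(len(embeddings))) for d in range(dim)]
        for j in range(out_len)
    ]


if __name__ == "__main__":
    # Token 0 always has fertility 2, token 1 always has fertility 1.
    print(alignment_marginals([[0.0, 0.0, 1.0], [0.0, 1.0]], out_len=3))
```

Because every operation is a sum of products of probabilities, the result is differentiable with respect to the fertility distributions, which is the property the abstract attributes to working with expected values.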

artificial intelligence, machine learning, natural language, (20 more...)

2210.03183

Country:

North America > United States > Texas (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York (0.04)
(13 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Zhangxiaobing, Tangzhenhao, Longzi, Fuxianghua

Generalization algorithm of multimodal pre-training model based on graph-text self-supervised training

Recently, many studies have shown that introducing visual information can effectively improve neural machine translation (NMT). Its effectiveness largely depends on the availability of a large number of bilingual parallel sentence pairs and manual image annotations. The scarcity of images and their limited effectiveness have been difficult to address. In this paper, we propose a multimodal pre-training generalization algorithm based on self-supervised training, which overcomes the scarcity and inaccuracy of visual information and thus extends the applicability of images to NMT. Specifically, we retrieve many images for existing sentences through a search engine and then, using the relationship between visual information and text, perform a self-supervised image-text training task to obtain more effective visual information for the text. We show that when the filtered information is used to fine-tune multimodal machine translation, translation quality on the global voice dataset is 0.5 BLEU higher than the baseline.

information, machine learning, natural language, (17 more...)

2302.10315

Country: Asia > China > Guangdong Province > Shenzhen (0.07)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Wu, Minghao, Foster, George, Qu, Lizhen, Haffari, Gholamreza

Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation

Existing work in document-level neural machine translation commonly concatenates several consecutive sentences as a pseudo-document, and then learns inter-sentential dependencies. This strategy limits the model's ability to leverage information from distant context. We overcome this limitation with a novel Document Flattening (DocFlat) technique that integrates Flat-Batch Attention (FBA) and Neural Context Gate (NCG) into the Transformer model to utilize information beyond the pseudo-document boundaries. FBA allows the model to attend to all positions in the batch and learns the relationships between positions explicitly, and NCG identifies the useful information from the distant context. We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation, and validate the effectiveness of two variants of DocFlat. Empirical results show that our approach outperforms strong baselines with statistical significance on BLEU, COMET and accuracy on the contrastive test set. The analyses highlight that DocFlat is highly effective in capturing long-range information.
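The core idea attributed to Flat-Batch Attention above, letting each position attend to all positions in the batch rather than only to its own pseudo-document, can be sketched as follows. This is an illustration only and not the authors' FBA: it uses the input vectors directly as queries, keys and values (no learned projections, no multiple heads, no Neural Context Gate), and the function name `flat_batch_attention` is hypothetical.

```python
import math


def flat_batch_attention(batch):
    """Scaled dot-product attention over a flattened batch.

    batch: list of sentences, each a list of vectors of equal dimension.
    Returns the same nested shape; every position attends to every position
    of every sentence in the batch.
    """
    flat = [vec for sent in batch for vec in sent]  # drop sentence boundaries
    dim = len(flat[0])
    out = []
    for q in flat:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in flat]
        top = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(x - top) for x in logits]
        total = sum(weights)
        out.append([sum(w * v[d] for w, v in zip(weights, flat)) / total for d in range(dim)])
    # Restore the original per-sentence grouping.
    result, pos = [], 0
    for sent in batch:
        result.append(out[pos:pos + len(sent)])
        pos += len(sent)
    return result


if __name__ == "__main__":
    batch = [[[1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]]]
    for sentence in flat_batch_attention(batch):
        print(sentence)
```

In the toy batch, the single position of the first sentence receives the same output as the identical vector in the second sentence, since both attend over the same flattened set of positions.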

artificial intelligence, computational linguistic, natural language, (15 more...)

2302.08079

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(20 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.48)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)