AITopics

2409.14842

Country:

Asia > Singapore (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-7-2024, 16:14:00 GMT

Reviews: Decoding with Value Networks for Neural Machine Translation

This paper addresses one of the limitation of NMT, the so-called exposure bias, that results from the fact that each word is chosen greedily. For this, the authors build on standard technique of reinforcement learning and try to predict, for each outgoing transition of a given state, the expected reward that will be achieved if the system take this transition. The article is overall very clear and the proposed ideas quite appealing, even if many of the decisions seem quite ad hoc (e.g. More importantly, several implementation "details" are not specified. For instance, in Equation (6), the BLEU function is defined at the sentence level while in the actual BLEU metric is defined at the corpus level.

neural machine translation, prediction, value network, (9 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Neural Information Processing SystemsOct-7-2024, 10:12:10 GMT

Reviews: Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation

Original Review: This work builds directly off of Transformer networks. They make two contributions to that kind of architecture. The first is to suggest running the encoder and decoder stacks layer by layer instead of running the encoder stack and passing information to the decoder stack. The second is to actually tie the weights of the encoder and decoder. Running a decoder layer right after its corresponding encoder layer processes (rather than running the next encoder layer) is also an interesting augmentation to Transformer networks.

encoder and decoder, layer-wise coordination, neural machine translation, (5 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Neural Information Processing SystemsOct-7-2024, 09:55:53 GMT

Reviews: e-SNLI: Natural Language Inference with Natural Language Explanations

I think the idea of explicable models is worth pursuing, and this is a decent contribution to showing how one might do that. It is unfortunate that this work shows a huge tradeoff between models that perform at high levels and those that explain well (from 4.1 it seems like we can get good performance, but then can't generate correct explanations very often and from 4.2 we can generate correct explanations more often at the expense of good performance). It also seems disappointing that the BLEU scores in the PREDICT setting are already so close to the inter-annotator agreement even though they are not correct explanations very often; this seems to suggest that we really do need to rely on the percent correct given by human evaluation and that the BLEU scores are not very meaningful. This seems like a bottleneck for this resource being widely adopted. Nonetheless, these findings are a solid contribution and so is the data if others are willing to do human evaluation or work on a new automatic metric for a task like this.

bleu score, explanation, natural language explanation, (13 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.30)

Neural Information Processing SystemsOct-7-2024, 08:17:34 GMT

Reviews: Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Update after author response: Thanks for the detailed response! It's a strong submission and I vote for an accept. This paper aims to speed up the computation of the softmax over a large vocabulary, which is quite common in some NLP tasks like e.g., language modeling. Specifically, the proposed method formulates the problem into a nearest neighbor search in a small world graph, and applies a log time algorithm to find the approximate top K predictions. The resulting time complexity reduces to logarithmic in the vocabulary size in expectation, in contrast to the linear one in a standard softmax.

fast and scalable decoding, graph representation, neural language model, (6 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.36)

Miceli-Barone, Antonio Valerio, Sun, Zhifan

A test suite of prompt injection attacks for LLM-based machine translation

LLM-based NLP systems typically work by embedding their input data into prompt templates which contain instructions and/or in-context examples, creating queries which are submitted to a LLM, and then parsing the LLM response in order to generate the system outputs. Prompt Injection Attacks (PIAs) are a type of subversion of these systems where a malicious user crafts special inputs which interfere with the prompt templates, causing the LLM to respond in ways unintended by the system designer. Recently, Sun and Miceli-Barone proposed a class of PIAs against LLM-based machine translation. Specifically, the task is to translate questions from the TruthfulQA test suite, where an adversarial prompt is prepended to the questions, instructing the system to ignore the translation instruction and answer the questions instead. In this test suite, we extend this approach to all the language pairs of the WMT 2024 General Machine Translation task. Moreover, we include additional attack formats in addition to the one originally studied.

large language model, machine learning, qm bw cw lid transl, (18 more...)

2410.05047

Country:

North America > United States (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Zhao, Haowen, Aprile, Francesco A., Bravi, Barbara

Computational design of target-specific linear peptide binders with TransformerBeta

The computational prediction and design of peptide binders targeting specific linear epitopes is crucial in biological and biomedical research, yet it remains challenging due to their highly dynamic nature and the scarcity of experimentally solved binding data. To address this problem, we built an unprecedentedly large-scale library of peptide pairs within stable secondary structures (beta sheets), leveraging newly available AlphaFold predicted structures. We then developed a machine learning method based on the Transformer architecture for the design of specific linear binders, in analogy to a language translation task. Our method, TransformerBeta, accurately predicts specific beta strand interactions and samples sequences with beta sheet-like molecular properties, while capturing interpretable physico-chemical interaction patterns. As such, it can propose specific candidate binders targeting linear epitope for experimental validation to inform protein design.

large language model, machine learning, natural language, (18 more...)

2410.16302

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)

Raunak, Vikas, Grundkiewicz, Roman, Junczys-Dowmunt, Marcin

On Instruction-Finetuning Neural Machine Translation Models

In this work, we introduce instruction finetuning for Neural Machine Translation (NMT) models, which distills instruction following capabilities from Large Language Models (LLMs) into orders-of-magnitude smaller NMT models. Our instruction-finetuning recipe for NMT models enables customization of translations for a limited but disparate set of translation-specific tasks. We show that NMT models are capable of following multiple instructions simultaneously and demonstrate capabilities of zero-shot composition of instructions. We also show that through instruction finetuning, traditionally disparate tasks such as formality-controlled machine translation, multi-domain adaptation as well as multi-modal translations can be tackled jointly by a single instruction finetuned NMT model, at a performance level comparable to LLMs such as GPT-3.5-Turbo. To the best of our knowledge, our work is among the first to demonstrate the instruction-following capabilities of traditional NMT models, which allows for faster, cheaper and more efficient serving of customized translations.

instruction, nmt model, translation, (13 more...)

2410.05553

Country:

Europe > Czechia > Prague (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Asvarov, Alidar, Grabovoy, Andrey

Neural machine translation system for Lezgian, Russian and Azerbaijani languages

We release the first neural machine translation system for translation between Russian, Azerbaijani and the endangered Lezgian languages, as well as monolingual and parallel datasets collected and aligned for training and evaluating the system. Multiple experiments are conducted to identify how different sets of training language pairs and data domains can influence the resulting translation quality. We achieve BLEU scores of 26.14 for Lezgian-Azerbaijani, 22.89 for Azerbaijani-Lezgian, 29.48 for Lezgian-Russian and 24.25 for Russian-Lezgian pairs. The quality of zero-shot translation is assessed on a Large Language Model, showing its high level of fluency in Lezgian. However, the model often refuses to translate, justifying itself with its incompetence. We contribute our translation model along with the collected parallel and monolingual corpora and sentence encoder for the Lezgian language.

experiment, machine translation, translation, (15 more...)

2410.05472

Country:

Asia > Russia (0.31)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Azerbaijan (0.05)
(7 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Perrella, Stefano, Proietti, Lorenzo, Cabot, Pere-Lluís Huguet, Barba, Edoardo, Navigli, Roberto

Beyond Correlation: Interpretable Evaluation of Machine Translation Metrics

Machine Translation (MT) evaluation metrics assess translation quality automatically. Recently, researchers have employed MT metrics for various new use cases, such as data filtering and translation re-ranking. However, most MT metrics return assessments as scalar scores that are difficult to interpret, posing a challenge to making informed design choices. Moreover, MT metrics' capabilities have historically been evaluated using correlation with human judgment, which, despite its efficacy, falls short of providing intuitive insights into metric performance, especially in terms of new metric use cases. To address these issues, we introduce an interpretable evaluation framework for MT metrics. Within this framework, we evaluate metrics in two scenarios that serve as proxies for the data filtering and translation re-ranking use cases. Furthermore, by measuring the performance of MT metrics using Precision, Recall, and F-score, we offer clearer insights into their capabilities than correlation with human judgments. Finally, we raise concerns regarding the reliability of manually curated data following the Direct Assessments+Scalar Quality Metrics (DA+SQM) guidelines, reporting a notably low agreement with Multidimensional Quality Metrics (MQM) annotations.

computational linguistic, metric, translation, (13 more...)

2410.05183

Country:

Asia > Singapore (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > Canada > Ontario > Toronto (0.04)
(20 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)