AITopics

2309.08943

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Florida (0.05)
(18 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.94)

Rodriguez, Juan Diego, Erk, Katrin, Durrett, Greg

X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs

arXiv.org Artificial Intelligence Sep-16-2023

Understanding when two pieces of text convey the same information is a goal touching many subproblems in NLP, including textual entailment and fact-checking. This problem becomes more complex when those two pieces of text are in different languages. Here, we introduce X-PARADE (Cross-lingual Paragraph-level Analysis of Divergences and Entailments), the first cross-lingual dataset of paragraph-level information divergences. Annotators label a paragraph in a target language at the span level and evaluate it with respect to a corresponding paragraph in a source language, indicating whether a given piece of information is the same, new, or new but can be inferred. This last notion establishes a link with cross-language NLI. Aligned paragraphs are sourced from Wikipedia pages in different languages, reflecting real information divergences observed in the wild. Armed with our dataset, we investigate a diverse set of approaches for this problem, including classic token alignment from machine translation, textual entailment methods that localize their decisions, and prompting of large language models. Our results show that these methods vary in their capability to handle inferable information, but they all fall short of human performance.

computational linguistic, linguistic, paragraph, (16 more...)

2309.08873

Country:

Asia > Middle East > Iraq (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(22 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Military (0.68)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Rigaki, Maria, Garcia, Sebastian

A Survey of Privacy Attacks in Machine Learning

arXiv.org Artificial Intelligence Sep-16-2023

As machine learning becomes more widely used, the need to study its implications in security and privacy becomes more urgent. Although the body of work in privacy has been steadily growing over the past few years, research on the privacy aspects of machine learning has received less focus than the security aspects. Our contribution in this research is an analysis of more than 40 papers related to privacy attacks against machine learning that have been published during the past seven years. We propose an attack taxonomy, together with a threat model that allows the categorization of different attacks based on the adversarial knowledge, and the assets under attack. An initial exploration of the causes of privacy leaks is presented, as well as a detailed analysis of the different attacks. Finally, we present an overview of the most commonly proposed defenses and a discussion of the open problems and future directions identified during our analysis.

adversary, dataset, learning, (14 more...)

doi: 10.1145/3624010

2007.07646

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > New York > New York County > New York City (0.05)
(35 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

SLIDE: Reference-free Evaluation for Machine Translation using a Sliding Document Window

Raunak, Vikas, Kocmi, Tom, Post, Matt

The prevailing approach for neural machine translation metrics is to work at the sentence level, constructing sequences of contextualized encoder states from the source sentence, a reference translation, and a system output. The specific mechanics vary by metric, but a general approach, employed by COMET (Rei et al., 2020b), is to pool these encodings into separate sentence-level embeddings, concatenate them, and feed them into a regressor, which is trained against human annotations. Quality Estimation (QE) approaches work similarly, but do not have access to a reference translation.

We are motivated by different but related ideas: (i) neural metrics often make use of underlying language models trained on wider contexts, which means there is no real impediment to feeding them multiple sentences, and (ii) a sentence's evaluation will differ based on its order in a block of sentences, so it may be helpful to evaluate each sentence in multiple different contexts. In this work, we therefore experiment with a strided window approach applied to COMET, whose underlying encoder is InfoXLM (Lample and Conneau, 2019; Chi et al.,
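As a rough illustration of a strided window over a document, the sketch below builds overlapping blocks of sentences and scores each block as one unit. It is a minimal sketch under stated assumptions, not the authors' implementation: `sliding_windows`, `score_document`, the `score_chunk` callback, the window and stride values, and the averaging of window scores are all hypothetical choices made for illustration, and `score_chunk` merely stands in for a reference-free metric such as a COMET QE model.

```python
from typing import Callable, List, Sequence


def sliding_windows(n_sentences: int, window: int, stride: int) -> List[range]:
    """Return index ranges of fixed-size windows moved across a document."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if n_sentences == 0:
        return []
    if n_sentences <= window:
        # Short documents are scored as a single block.
        return [range(0, n_sentences)]
    starts = list(range(0, n_sentences - window + 1, stride))
    # Assumption: add a final window so the tail of the document is covered.
    if starts[-1] + window < n_sentences:
        starts.append(n_sentences - window)
    return [range(s, s + window) for s in starts]


def score_document(
    sources: Sequence[str],
    hypotheses: Sequence[str],
    score_chunk: Callable[[str, str], float],  # hypothetical stand-in for a QE metric
    window: int = 3,
    stride: int = 1,
    sep: str = " ",
) -> float:
    """Score each window as one concatenated block, then average (an assumption)."""
    if len(sources) != len(hypotheses):
        raise ValueError("sources and hypotheses must be aligned sentence by sentence")
    scores = []
    for w in sliding_windows(len(sources), window, stride):
        src_block = sep.join(sources[i] for i in w)
        hyp_block = sep.join(hypotheses[i] for i in w)
        scores.append(score_chunk(src_block, hyp_block))
    return sum(scores) / len(scores) if scores else 0.0


def toy_length_ratio(src: str, hyp: str) -> float:
    """Toy scorer used only to make the sketch runnable; not a real metric."""
    return min(len(src), len(hyp)) / max(len(src), len(hyp), 1)
```

With a stride smaller than the window, each sentence is seen in several different contexts, which is the motivation given in point (ii) above.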

computational linguistic, machine translation, translation, (12 more...)

2309.08832

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Reinauer, Raphael, Simianer, Patrick, Uhlig, Kaden, Mosig, Johannes E. M., Wuebker, Joern

Neural Machine Translation Models Can Learn to be Few-shot Learners

Large Language Models exhibit an emergent ability to use a small number of examples to learn to perform in novel domains and tasks, known as in-context learning (ICL). In this work, we show that a much smaller model can be trained to perform ICL by fine-tuning towards a specialized training objective, exemplified on the task of domain adaptation for neural machine translation. With this capacity for ICL, the model can take advantage of relevant few-shot examples to adapt its output towards the domain. We compare the quality of this domain adaptation to traditional supervised techniques and ICL with a 40B-parameter Large Language Model. Our approach allows efficient batch inference on a mix of domains and outperforms state-of-the-art baselines in terms of both translation quality and immediate adaptation rate, i.e., the ability to reproduce a specific term after being shown a single example.

experiment, tage 2, translation, (13 more...)

2309.0859

Country:

Europe > Finland > Pirkanmaa > Tampere (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre:

Research Report (0.64)
Instructional Material (0.54)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens

Kim, Minsu, Choi, Jeongsoo, Maiti, Soumi, Yeo, Jeong Hun, Watanabe, Shinji, Ro, Yong Man

In this paper, we propose methods to build a powerful and efficient Image-to-Speech captioning (Im2Sp) model. To this end, we start with importing the rich knowledge related to image comprehension and language modeling from a large-scale pre-trained vision-language model into Im2Sp. We set the output of the proposed Im2Sp as discretized speech units, i.e., the quantized speech features of a self-supervised speech model. The speech units mainly contain linguistic information while suppressing other characteristics of speech. This allows us to incorporate the language modeling capability of the pre-trained vision-language model into the spoken language modeling of Im2Sp. With the vision-language pre-training strategy, we set new state-of-the-art Im2Sp performances on two widely used benchmark databases, COCO and Flickr8k. Then, we further improve the efficiency of the Im2Sp model. Similar to the speech unit case, we convert the original image into image units, which are derived through vector quantization of the raw image. With these image units, we can drastically reduce the required data storage for saving image data to just 0.8% when compared to the original image data in terms of bits. Demo page: https://ms-dot-k.github.io/Image-to-Speech-Captioning.

decoder, im2sp model, proc, (13 more...)

2309.08531

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > South Korea (0.04)
Africa > Middle East > Morocco > Casablanca-Settat Region > Casablanca (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)
(2 more...)

M'hamdi, Meryem, May, Jonathan, Dernoncourt, Franck, Bui, Trung, Yoon, Seunghyun

Multilingual Sentence-Level Semantic Search using Meta-Distillation Learning

Multilingual semantic search is the task of retrieving content relevant to a query expressed in different language combinations. This requires a better semantic understanding of the user's intent and its contextual meaning. Multilingual semantic search is less explored and more challenging than its monolingual or bilingual counterparts, due to the lack of multilingual parallel resources for this task and the need to circumvent "language bias". In this work, we propose an alignment approach, MAML-Align, designed specifically for low-resource scenarios. Our approach leverages meta-distillation learning based on MAML, an optimization-based Model-Agnostic Meta-Learner. MAML-Align distills knowledge from a Teacher meta-transfer model T-MAML, specialized in transferring from monolingual to bilingual semantic search, to a Student model S-MAML, which meta-transfers from bilingual to multilingual semantic search. To the best of our knowledge, we are the first to extend meta-distillation to a multilingual search application. Our empirical results show that on top of a strong baseline based on sentence transformers, our meta-distillation approach boosts the gains provided by MAML and significantly outperforms naive fine-tuning methods. Furthermore, multilingual meta-distillation learning improves generalization even to unseen languages.

computational linguistic, evaluation, semantic search, (14 more...)

2309.08185

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(18 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

arXiv.org Artificial Intelligence Sep-14-2023

Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer

Wang, Yongqi, Bai, Jionghao, Huang, Rongjie, Li, Ruiqi, Hong, Zhiqing, Zhao, Zhou

Direct speech-to-speech translation (S2ST) with discrete self-supervised representations has achieved remarkable accuracy, but is unable to preserve the speaker timbre of the source speech during translation. Meanwhile, the scarcity of high-quality speaker-parallel data poses a challenge for learning style transfer between source and target speech. We propose an S2ST framework with an acoustic language model based on discrete units from a self-supervised model and a neural codec for style transfer. The acoustic language model leverages self-supervised in-context learning, acquiring the ability for style transfer without relying on any speaker-parallel data, thereby overcoming the issue of data scarcity. By using extensive training data, our model achieves zero-shot cross-lingual style transfer on previously unseen source languages. Experiments show that our model generates translated speech with high fidelity and style similarity. Audio samples are available at http://stylelm.github.io/.

representation, speech, translation, (12 more...)

2309.07566

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)

arXiv.org Artificial Intelligence Sep-14-2023

Direct Text to Speech Translation System using Acoustic Units

Mingote, Victoria, Gimeno, Pablo, Vicente, Luis, Khurana, Sameer, Laurent, Antoine, Duret, Jarod

This paper proposes a direct text to speech translation system using discrete acoustic units. This framework employs text in different source languages as input to generate speech in the target language without the need for text transcriptions in this language. Motivated by the success of acoustic units in previous works for direct speech to speech translation systems, we use the same pipeline to extract the acoustic units using a speech encoder combined with a clustering algorithm. Once units are obtained, an encoder-decoder architecture is trained to predict them. Then a vocoder generates speech from units. Our approach for direct text to speech translation was tested on the new CVSS corpus with two different text mBART models employed as initialisation. The systems presented report competitive performance for most of the language pairs evaluated. Besides, results show a remarkable improvement when initialising our proposed architecture with a model pre-trained with more languages.

acoustic unit, speech, translation, (14 more...)

doi: 10.1109/LSP.2023.3313513

2309.07478

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > France (0.04)
Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Guo, Shoutao, Zhang, Shaolei, Feng, Yang

Glancing Future for Simultaneous Machine Translation

arXiv.org Artificial Intelligence Sep-12-2023

Simultaneous machine translation (SiMT) outputs translation while reading the source sentence. Unlike conventional sequence-to-sequence (seq2seq) training, existing SiMT methods adopt prefix-to-prefix (prefix2prefix) training, in which the model predicts target tokens based on partial source tokens. However, prefix2prefix training diminishes the model's ability to capture global information and introduces forced predictions due to the absence of essential source information. Consequently, it is crucial to bridge the gap between prefix2prefix training and seq2seq training to enhance the translation capability of the SiMT model. In this paper, we propose a novel method that glances at future source information through curriculum learning to achieve the transition from seq2seq training to prefix2prefix training. Specifically, we gradually reduce the available source information from the whole sentence to the prefix corresponding to the desired latency. Our method is applicable to a wide range of SiMT methods, and experiments demonstrate that it outperforms strong baselines.
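The gradual reduction of visible source context can be pictured with a small schedule function. This is a minimal sketch under stated assumptions rather than the paper's method: the wait-k read policy, the linear decay over training progress, and the names `wait_k_prefix` and `visible_source_len` are all illustrative choices that the abstract does not specify.

```python
import math


def wait_k_prefix(t: int, k: int, src_len: int) -> int:
    """Source tokens visible under a wait-k policy when predicting target token t (0-based).

    Assumption: wait-k is used here only as one example of a latency-defining policy.
    """
    return min(src_len, t + k)


def visible_source_len(t: int, k: int, src_len: int, progress: float) -> int:
    """Number of source tokens the model may attend to at a given training progress.

    progress = 0.0 gives the whole sentence (seq2seq-style training);
    progress = 1.0 gives only the latency prefix (prefix2prefix-style training).
    The linear interpolation in between is an assumption made for illustration.
    """
    progress = min(1.0, max(0.0, progress))
    prefix = wait_k_prefix(t, k, src_len)
    future = src_len - prefix  # tokens beyond the prefix that can still be "glanced" at
    return prefix + math.ceil((1.0 - progress) * future)


if __name__ == "__main__":
    # Visible source length for the first target token of a 10-token sentence, wait-3.
    for p in (0.0, 0.5, 1.0):
        print(p, visible_source_len(t=0, k=3, src_len=10, progress=p))
```

In a training loop, the returned length would typically be turned into an attention mask over the source tokens; that step is omitted here.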

artificial intelligence, natural language, translation, (17 more...)

2309.06179

Country: Asia > China (0.14)

Genre: Research Report (0.70)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.47)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.47)
Energy > Oil & Gas > Midstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)