AITopics

2306.08492

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Europe > Belgium (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (0.73)
Government > Military (0.63)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Segonne, Vincent, Mickus, Timothee

"Definition Modeling: To model definitions." Generating Definitions With Little to No Semantics

arXiv.org Artificial IntelligenceJun-14-2023

Definition Modeling, the task of generating definitions, was first proposed as a means to evaluate the semantic quality of word embeddings-a coherent lexical semantic representations of a word in context should contain all the information necessary to generate its definition. The relative novelty of this task entails that we do not know which factors are actually relied upon by a Definition Modeling system. In this paper, we present evidence that the task may not involve as much semantics as one might expect: we show how an earlier model from the literature is both rather insensitive to semantic aspects such as explicit polysemy, as well as reliant on formal similarities between headwords and words occurring in its glosses, casting doubt on the validity of the task as a means to evaluate embeddings.

computational linguistic, machine learning, natural language, (19 more...)

2306.08433

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > New York > New York County > New York City (0.04)
(23 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Tang, Joël, Fomicheva, Marina, Specia, Lucia

Reducing Hallucinations in Neural Machine Translation with Feature Attribution

arXiv.org Artificial IntelligenceJun-14-2023

Neural conditional language generation models achieve the state-of-the-art in Neural Machine Translation (NMT) but are highly dependent on the quality of parallel training dataset. When trained on low-quality datasets, these models are prone to various error types, including hallucinations, i.e. outputs that are fluent, but unrelated to the source sentences. These errors are particularly dangerous, because on the surface the translation can be perceived as a correct output, especially if the reader does not understand the source language. We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT. We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations. We then leverage these methods to propose a novel loss function that substantially helps reduce hallucinations and does not require retraining the model from scratch.

artificial intelligence, machine translation, natural language, (15 more...)

2211.09878

Country:

Asia > North Korea (0.14)
North America > United States (0.04)
Europe > Middle East > Malta (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Law (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation

Han, Yuchen, Xu, Chen, Xiao, Tong, Zhu, Jingbo

Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST). The commonplace "modality gap" between speech and text data often leads to inconsistent inputs between pre-training and fine-tuning. However, we observe that this gap occurs in the early stages of fine-tuning, but does not have a major impact on the final performance. On the other hand, we find that there has another gap, which we call the "capacity gap": high resource tasks (such as ASR and MT) always require a large model to fit, when the model is reused for a low resource task (E2E ST), it will get a sub-optimal performance due to the over-fitting. In a case study, we find that the regularization plays a more important role than the well-designed modality adaption method, which achieves 29.0 for en-de and 40.3 for en-fr on the MuST-C dataset. Code and models are available at https://github.com/hannlp/TAB.

computational linguistic, fine-tuning, speech translation, (13 more...)

2306.0765

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(10 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization

Zhang, Lining, Mille, Simon, Hou, Yufang, Deutsch, Daniel, Clark, Elizabeth, Liu, Yixin, Mahamood, Saad, Gehrmann, Sebastian, Clinciu, Miruna, Chandu, Khyathi, Sedoc, João

To prevent the costly and inefficient use of resources on low-quality annotations, we want a method for creating a pool of dependable annotators who can effectively complete difficult tasks, such as evaluating automatic summarization. Thus, we investigate the recruitment of high-quality Amazon Mechanical Turk workers via a two-step pipeline. We show that we can successfully filter out subpar workers before they carry out the evaluations and obtain high-agreement annotations with similar constraints on resources. Although our workers demonstrate a strong consensus among themselves and CloudResearch workers, their alignment with expert judgments on a subset of the data is not as expected and needs further training in correctness. This paper still serves as a best practice for the recruitment of qualified annotators in other challenging annotation tasks.

large language model, machine learning, natural language, (24 more...)

2212.10397

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Florida > Palm Beach County > Palm Beach (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Media > Film (0.68)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
(2 more...)

Samuel, David, Øvrelid, Lilja

Tokenization with Factorized Subword Encoding

In recent years, language models have become increasingly larger and more complex. However, the input representations for these models continue to rely on simple and greedy subword tokenization methods. In this paper, we propose a novel tokenization method that factorizes subwords onto discrete triplets using a VQ-VAE model. The effectiveness of the proposed tokenization method, referred to as the Factorizer, is evaluated on language modeling and morpho-syntactic tasks for 7 diverse languages. Results indicate that this method is more appropriate and robust for morphological tasks than the commonly used byte-pair encoding (BPE) tokenization algorithm.

artificial intelligence, machine learning, natural language, (21 more...)

2306.07764

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Gow-Smith, Edward, Berard, Alexandre, Boito, Marcely Zanon, Calapodescu, Ioan

NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track

This paper presents NAVER LABS Europe's systems for Tamasheq-French and Quechua-Spanish speech translation in the IWSLT 2023 Low-Resource track. Our work attempts to maximize translation quality in low-resource settings using multilingual parameter-efficient solutions that leverage strong pre-trained models. Our primary submission for Tamasheq outperforms the previous state of the art by 7.5 BLEU points on the IWSLT 2022 test set, and achieves 23.6 BLEU on this year's test set, outperforming the second best participant by 7.7 points. For Quechua, we also rank first and achieve 17.7 BLEU, despite having only two hours of translation data. Finally, we show that our proposed multilingual architecture is also competitive for high-resource languages, outperforming the best unconstrained submission to the IWSLT 2021 Multilingual track, despite using much less training data and compute.

artificial intelligence, natural language, submission, (18 more...)

2306.07763

Country:

Africa > Mali (0.06)
Africa > Niger > Niamey > Niamey (0.05)
Africa > Nigeria (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Soft Language Clustering for Multilingual Model Pre-training

Zeng, Jiali, Jiang, Yufan, Yin, Yongjing, Jing, Yi, Meng, Fandong, Lin, Binghuai, Cao, Yunbo, Zhou, Jie

Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities, however, their performance is hindered when the target language has distant typology from source languages or when pre-training data is limited in size. In this paper, we propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally. Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods. On the tasks of XTREME including text classification, sequence labeling, question answering, and sentence retrieval, both base- and large-size language models pre-trained with our proposed method exhibit consistent performance improvement. Furthermore, it provides substantial advantages for low-resource languages in unsupervised sentence retrieval and for target languages that differ greatly from the source language in cross-lingual transfer.

computational linguistic, large language model, machine learning, (20 more...)

2306.0761

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(11 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

PolyVoice: Language Models for Speech to Speech Translation

Dong, Qianqian, Huang, Zhiying, Tian, Qiao, Xu, Chen, Ko, Tom, Zhao, Yunlong, Feng, Siyuan, Li, Tang, Wang, Kexin, Cheng, Xuxin, Yue, Fengpeng, Bai, Ye, Chen, Xi, Lu, Lu, Ma, Zejun, Wang, Yuping, Wang, Mingxuan, Wang, Yuxuan

We propose PolyVoice, a language model-based framework for speech-to-speech translation (S2ST) system. Our framework consists of two language models: a translation language model and a speech synthesis language model. We use discretized speech units, which are generated in a fully unsupervised way, and thus our framework can be used for unwritten languages. For the speech synthesis part, we adopt the existing VALL-E X approach and build a unit-based audio language model. This grants our framework the ability to preserve the voice characteristics and the speaking style of the original speech. We examine our system on Chinese $\rightarrow$ English and English $\rightarrow$ Spanish pairs. Experimental results show that our system can generate speech with high translation quality and audio quality. Speech samples are available at https://speechtranslation.github.io/polyvoice.

artificial intelligence, machine translation, natural language, (17 more...)

2306.02982

Country:

Asia > Singapore (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(5 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial IntelligenceJun-12-2023

Online Prototype Alignment for Few-shot Policy Transfer

Yi, Qi, Zhang, Rui, Peng, Shaohui, Guo, Jiaming, Gao, Yunkai, Yuan, Kaizhao, Chen, Ruizhi, Lan, Siming, Hu, Xing, Du, Zidong, Zhang, Xishan, Guo, Qi, Chen, Yunji

Domain adaptation in reinforcement learning (RL) mainly deals with the changes of observation when transferring the policy to a new environment. Many traditional approaches of domain adaptation in RL manage to learn a mapping function between the source and target domain in explicit or implicit ways. However, they typically require access to abundant data from the target domain. Besides, they often rely on visual clues to learn the mapping function and may fail when the source domain looks quite different from the target domain. To address these problems, we propose a novel framework Online Prototype Alignment (OPA) to learn the mapping function based on the functional similarity of elements and is able to achieve the few-shot policy transfer within only several episodes. The key insight of OPA is to introduce an exploration mechanism that can interact with the unseen elements of the target domain in an efficient and purposeful manner, and then connect them with the seen elements in the source domain according to their functionalities (instead of visual clues). Experimental results show that when the target domain looks visually different from the source domain, OPA can achieve better transfer performance even with much fewer samples from the target domain, outperforming prior methods.

online prototype alignment, source domain, target domain, (14 more...)

2306.07307

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)