AITopics

2210.0373

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence Oct-7-2022

Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task

Ma, Cong, Zhang, Yaping, Tu, Mei, Han, Xu, Wu, Linghui, Zhao, Yang, Zhou, Yu

End-to-end text image translation (TIT), which aims at translating the source language embedded in images to the target language, has attracted intensive attention in recent research. However, data sparsity limits the performance of end-to-end text image translation. Multi-task learning is a non-trivial way to alleviate this problem by exploring knowledge from complementary related tasks. In this paper, we propose a novel text-translation-enhanced text image translation method, which trains the end-to-end model with text translation as an auxiliary task. By sharing model parameters and multi-task training, our model is able to take full advantage of easily available large-scale parallel text corpora. Extensive experimental results show that our proposed method outperforms existing end-to-end methods, and that joint multi-task learning with both text translation and recognition tasks achieves better results, showing that the translation and recognition auxiliary tasks are complementary.

machine learning, natural language, translation, (17 more...)

2210.03887

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)
(12 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Language Models are Multilingual Chain-of-Thought Reasoners

Shi, Freda, Suzgun, Mirac, Freitag, Markus, Wang, Xuezhi, Srivats, Suraj, Vosoughi, Soroush, Chung, Hyung Won, Tay, Yi, Ruder, Sebastian, Zhou, Denny, Das, Dipanjan, Wei, Jason

We evaluate the reasoning abilities of large language models in multilingual settings. We introduce the Multilingual Grade School Math (MGSM) benchmark, by manually translating 250 grade-school math problems from the GSM8K dataset (Cobbe et al., 2021) into ten typologically diverse languages. We find that the ability to solve MGSM problems via chain-of-thought prompting emerges with increasing model scale, and that models have strikingly strong multilingual reasoning abilities, even in underrepresented languages such as Bengali and Swahili. Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment. The MGSM benchmark is publicly available at https://github.com/google-research/url-nlp.

computational linguistic, large language model, natural language, (18 more...)

2210.03057

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Peru > Cusco Department > Cusco Province > Cusco (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(7 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Wein, Shira, Wang, Zhuxin, Schneider, Nathan

Measuring Fine-Grained Semantic Equivalence with Abstract Meaning Representation

Identifying semantically equivalent sentences is important for many cross-lingual and mono-lingual NLP tasks. Current approaches to semantic equivalence take a loose, sentence-level approach to "equivalence," despite previous evidence that fine-grained differences and implicit content have an effect on human understanding (Roth and Anthonio, 2021) and system performance (Briakou and Carpuat, 2021). In this work, we introduce a novel, more sensitive method of characterizing semantic equivalence that leverages Abstract Meaning Representation graph structures. We develop an approach, which can be used with either gold or automatic AMR annotations, and demonstrate that our solution is in fact finer-grained than existing corpus filtering methods and more accurate at predicting strictly equivalent sentences than existing semantic similarity metrics. We suggest that our finer-grained measure of semantic equivalence could limit the workload in the task of human post-edited machine translation and in human evaluation of sentence similarity.
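To make the idea of graph-based equivalence concrete, here is a minimal sketch that scores two AMR-like graphs by the overlap of their (source, relation, target) triples. It is a simplified stand-in, not the authors' method: it assumes nodes are already identified by their concept labels, so it skips the variable alignment that Smatch-style metrics perform, and the function name `triple_f1` and the example frame labels are hypothetical.

```python
from typing import Set, Tuple

# A graph edge written as (source concept, relation, target concept).
Triple = Tuple[str, str, str]


def triple_f1(gold: Set[Triple], other: Set[Triple]) -> float:
    """F1 over exactly matching triples (assumption: no variable alignment)."""
    if not gold and not other:
        return 1.0  # two empty graphs are trivially identical
    matched = len(gold & other)
    if matched == 0:
        return 0.0
    precision = matched / len(other)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def strictly_equivalent(gold: Set[Triple], other: Set[Triple]) -> bool:
    """Under this sketch, only identical triple sets count as equivalent."""
    return triple_f1(gold, other) == 1.0
```

A sentence-level similarity score might rate "The boy wants to go" and "The boy wants to leave" as near-paraphrases; under this sketch the differing predicate changes two of three triples, so the pair is not flagged as strictly equivalent.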

artificial intelligence, natural language, text processing, (17 more...)

2210.03018

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Quebec > Montreal (0.05)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
(13 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)

Reinforcement Learning with Large Action Spaces for Neural Machine Translation

Yehudai, Asaf, Choshen, Leshem, Fox, Lior, Abend, Omri

Applying reinforcement learning (RL) following maximum likelihood estimation (MLE) pre-training is a versatile method for enhancing neural machine translation (NMT) performance. However, recent work has argued that the gains produced by RL for NMT are mostly due to promoting tokens that have already received a fairly high probability in pre-training. We hypothesize that the large action space is a main obstacle to RL's effectiveness in MT, and conduct two sets of experiments that lend support to our hypothesis. First, we find that reducing the size of the vocabulary improves RL's effectiveness. Second, we find that effectively reducing the dimension of the action space without changing the vocabulary also yields notable improvement as evaluated by BLEU, semantic similarity, and human evaluation. Indeed, by initializing the network's final fully connected layer (which maps the network's internal dimension to the vocabulary dimension) with a layer that generalizes over similar actions, we obtain a substantial improvement in RL performance: 1.5 BLEU points on average.

machine learning, reinforcement learning, translation, (19 more...)

2210.03053

Country:

Europe > Germany > Berlin (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Duality Regularization for Unsupervised Bilingual Lexicon Induction

Bai, Xuefeng, Zhang, Yue, Cao, Hailong, Zhao, Tiejun

Unsupervised bilingual lexicon induction naturally exhibits duality, which results from symmetry in back-translation. For example, EN-IT and IT-EN induction can be mutually primal and dual problems. Current state-of-the-art methods, however, consider the two tasks independently. In this paper, we propose to train primal and dual models jointly, using regularizers to encourage consistency in back-translation cycles. Experiments across 6 language pairs show that the proposed method significantly outperforms competitive baselines, obtaining the best published results on a standard benchmark.
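One way to picture a back-translation consistency regularizer is as a cycle loss over two linear embedding maps. The sketch below is an illustration under stated assumptions, not the paper's exact objective: it assumes the primal and dual models are plain matrices `f` (source to target) and `g` (target to source), uses squared Euclidean distance, and all names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sq_dist(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def cycle_loss(f: Matrix, g: Matrix, xs: List[Vector], ys: List[Vector]) -> float:
    """Mean reconstruction error of x -> f -> g -> x plus y -> g -> f -> y."""
    forward = sum(sq_dist(matvec(g, matvec(f, x)), x) for x in xs) / len(xs)
    backward = sum(sq_dist(matvec(f, matvec(g, y)), y) for y in ys) / len(ys)
    return forward + backward


# Usage: a 90-degree rotation and its inverse form a perfectly consistent pair.
rotate = [[0.0, -1.0], [1.0, 0.0]]
unrotate = [[0.0, 1.0], [-1.0, 0.0]]
loss = cycle_loss(rotate, unrotate, [[1.0, 0.0]], [[0.0, 1.0]])
```

In joint training, a term like this would be added to each direction's own induction loss, so that the two mappings are pushed toward being mutual inverses rather than being learned independently.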

artificial intelligence, machine learning, natural language, (15 more...)

1909.01013

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Belgium (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.31)

Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

Tramèr, Florian, Shokri, Reza, Joaquin, Ayrton San, Le, Hoang, Jagielski, Matthew, Hong, Sanghyun, Carlini, Nicholas

We introduce a new class of attacks on machine learning models. We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties. Our active inference attacks connect two independent lines of work targeting the integrity and privacy of machine learning training data. Our attacks are effective across membership inference, attribute inference, and data extraction. For example, our targeted attacks can poison <0.1% of the training dataset to boost the performance of inference attacks by 1 to 2 orders of magnitude. Further, an adversary who controls a significant fraction of the training data (e.g., 50%) can launch untargeted attacks that enable 8x more precise inference on all other users' otherwise-private data points. Our results cast doubts on the relevance of cryptographic privacy guarantees in multiparty computation protocols for machine learning, if parties can arbitrarily select their share of training data.

artificial intelligence, machine learning, natural language, (19 more...)

2204.00032

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.16)
North America > United States > Oregon (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)

#artificialintelligence Oct-5-2022, 12:53:46 GMT

NLP Interview Questions - KDnuggets

Not every data scientist works with NLP or is required to know it. Whether you are depends on the company interviewing you for a data science position. Still, you'll have to know what it is, if only so you can avoid it in your career. If you're intrigued by NLP and willing to learn more, you will benefit from knowing what interview questions to expect. And no, it's not that pseudoscientific psychological approach that gained popularity recently.

data scientist, knowledge, nlp, (11 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.30)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.30)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.30)

Oncevay, Arturo, Rojas, Kervy Dante Rivas, Sanchez, Liz Karen Chavez, Zariquiey, Roberto

Revisiting Syllables in Language Modelling and their Application on Low-Resource Machine Translation

arXiv.org Artificial Intelligence Oct-5-2022

Language modelling and machine translation tasks mostly use subword or character inputs, but syllables are seldom used. Syllables provide shorter sequences than characters, require less specialised extraction rules than morphemes, and their segmentation is not impacted by the corpus size. In this study, we first explore the potential of syllables for open-vocabulary language modelling in 21 languages. We use rule-based syllabification methods for six languages and address the rest with hyphenation, which works as a syllabification proxy. With a comparable perplexity, we show that syllables outperform characters and other subwords. Moreover, we study the importance of syllables on neural machine translation for an unrelated and low-resource language pair (Spanish–Shipibo-Konibo). In pairwise and multilingual systems, syllables outperform unsupervised subwords, and further morphological segmentation methods, when translating into a highly synthetic language with a transparent orthography (Shipibo-Konibo). Finally, we perform some human evaluation, and discuss limitations and opportunities.

artificial intelligence, machine translation, natural language, (15 more...)

2210.02509

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.05)
Asia > Myanmar (0.04)
(20 more...)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Ohta, Mayumi, Kreutzer, Julia, Riezler, Stefan

JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT

arXiv.org Artificial Intelligence Oct-5-2022

JoeyS2T is a JoeyNMT extension for speech-to-text tasks such as automatic speech recognition and end-to-end speech translation. It inherits the core philosophy of JoeyNMT, a minimalist NMT toolkit built on PyTorch, seeking simplicity and accessibility. JoeyS2T's workflow is self-contained, running from data pre-processing through model training and prediction to evaluation, and is seamlessly integrated into JoeyNMT's compact and simple code base. On top of JoeyNMT's state-of-the-art Transformer-based encoder-decoder architecture, JoeyS2T provides speech-oriented components such as convolutional layers, SpecAugment, CTC-loss, and WER evaluation. Despite its simplicity compared to prior implementations, JoeyS2T performs competitively on English speech recognition and English-to-German speech translation benchmarks. The implementation is accompanied by a walk-through tutorial and available at https://github.com/may-/joeys2t.
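Of the components listed, WER (word error rate) is the easiest to illustrate: it is the word-level edit distance between hypothesis and reference, divided by the reference length. The following is a generic sketch of that standard definition, not JoeyS2T's own implementation; the function name and the whitespace tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Usage: one substitution ("sat" -> "sit") and one dropped "the"
# against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many extra words.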

computational linguistic, machine learning, natural language, (16 more...)

2210.02545

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Indonesia > Bali (0.04)
(13 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)