 Machine Translation


What Are The Risks And Benefits Of Artificial Intelligence?

#artificialintelligence

What are the risks and benefits of artificial intelligence? It's a complicated topic, but I'll try to unpack a few key points here. Let's start with a quick definition: AI is the simulation of human intelligence by machines. Examples of AI systems used regularly in developed countries include Amazon's Alexa, smart replies in Gmail, chatbots, predictive searches in Google, and recommendations. At a baseline level, AI helps improve our everyday lives by solving pain points, streamlining processes, and advancing human knowledge.


Machine Learning for Clinical Predictive Analytics

arXiv.org Machine Learning

In this chapter, we provide a brief overview of applying machine learning techniques for clinical prediction tasks. We begin with a quick introduction to the concepts of machine learning and outline some of the most common machine learning algorithms. Next, we demonstrate how to apply the algorithms with appropriate toolkits to conduct machine learning experiments for clinical prediction tasks. The objectives of this chapter are to (1) understand the basics of machine learning techniques and the reasons behind why they are useful for solving clinical prediction problems, (2) understand the intuition behind some machine learning models, including regression, decision trees, and support vector machines, and (3) understand how to apply these models to clinical prediction problems using publicly available datasets via case studies.


Global Autoregressive Models for Data-Efficient Sequence Learning

arXiv.org Artificial Intelligence

Standard autoregressive seq2seq models are easily trained by max-likelihood, but tend to show poor results under small-data conditions. We introduce a class of seq2seq models, GAMs (Global Autoregressive Models), which combine an autoregressive component with a log-linear component, allowing the use of global a priori features to compensate for lack of data. We train these models in two steps. In the first step, we obtain an unnormalized GAM that maximizes the likelihood of the data, but is improper for fast inference or evaluation. In the second step, we use this GAM to train (by distillation) a second autoregressive model that approximates the normalized distribution associated with the GAM, and can be used for fast inference and evaluation. Our experiments focus on language modelling under synthetic conditions and show a strong perplexity reduction of using the second autoregressive model over the standard one.
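
As a rough illustration of how an autoregressive component and a log-linear component might be combined, the sketch below adds a weighted feature term to an autoregressive log-probability and normalizes over a tiny candidate set. The functional form, the toy probabilities, the single binary feature, and the weight are all assumptions made for illustration; the abstract does not spell them out, and a real model could not normalize by enumeration like this.

```python
import math

def gam_log_score(ar_log_prob, features, weights):
    """Unnormalized log-score: autoregressive log-prob plus a log-linear term."""
    return ar_log_prob + sum(w * f for w, f in zip(weights, features))

def normalize(log_scores):
    """Turn log-scores into probabilities over a small, enumerable candidate set."""
    m = max(log_scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in log_scores))
    return [math.exp(s - log_z) for s in log_scores]

# Hypothetical toy data: three candidate sequences, one binary global feature.
ar_probs = [0.5, 0.3, 0.2]
features = [[0.0], [1.0], [0.0]]
weights = [1.0]  # assumed feature weight, not taken from the paper

log_scores = [
    gam_log_score(math.log(p), f, weights)
    for p, f in zip(ar_probs, features)
]
gam_probs = normalize(log_scores)
```

In this toy setting the candidate that fires the global feature receives more probability mass than the autoregressive component alone would give it, which is the intuition behind using a priori features when data is scarce.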


Fine-Tuning Language Models from Human Preferences

arXiv.org Machine Learning

Reward learning enables the application of reinforcement learning (RL) to tasks where reward is defined by human judgment, building a model of reward by asking humans questions. Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and safe for real-world tasks. In this paper, we build on advances in generative pretraining of language models to apply reward learning to four natural language tasks: continuing text with positive sentiment or physically descriptive language, and summarization tasks on the TL;DR and CNN/Daily Mail datasets. For stylistic continuation we achieve good results with only 5,000 comparisons evaluated by humans. For summarization, models trained with 60,000 comparisons copy whole sentences from the input but skip irrelevant preamble; this leads to reasonable ROUGE scores and very good performance according to our human labelers, but may be exploiting the fact that labelers rely on simple heuristics.
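
One common way to fit a reward model to human comparisons is a softmax cross-entropy loss over the rewards of the candidates shown to the labeler. The sketch below assumes that form; the function name and the example reward values are hypothetical, and the abstract does not state the exact loss the authors use.

```python
import math

def preference_loss(rewards, chosen):
    """Negative log-probability that the human-chosen candidate wins a softmax over rewards."""
    m = max(rewards)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(r - m) for r in rewards))
    return log_z - rewards[chosen]

# Hypothetical reward-model outputs for four candidate continuations.
rewards = [0.2, 1.5, -0.3, 0.0]
loss_when_best_chosen = preference_loss(rewards, chosen=1)
loss_when_worst_chosen = preference_loss(rewards, chosen=2)
```

The loss is small when the reward model already ranks the labeler's choice highest and grows when it disagrees, so minimizing it pulls the learned reward toward the human judgments.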


Memory-Augmented Neural Networks for Machine Translation

arXiv.org Machine Learning

Memory-augmented neural networks (MANNs) have been shown to outperform other recurrent neural network architectures on a series of artificial sequence learning tasks, yet they have had limited application to real-world tasks. We evaluate direct application of Neural Turing Machines (NTM) and Differentiable Neural Computers (DNC) to machine translation. We further propose and evaluate two models which extend the attentional encoder-decoder with capabilities inspired by memory-augmented neural networks. We evaluate our proposed models on IWSLT Vietnamese to English and ACL Romanian to English datasets. Our proposed models and the memory-augmented neural networks perform similarly to the attentional encoder-decoder on the Vietnamese to English translation task, while scoring 0.3-1.9 BLEU lower on the Romanian to English task. Interestingly, our analysis shows that, despite being equipped with additional flexibility and being randomly initialized, memory-augmented neural networks learn an algorithm for machine translation almost identical to the attentional encoder-decoder.


r/MachineLearning - [Project] Multilingual Neural Machine Translation using Transformers with Conditional Normalization.

#artificialintelligence

The goal here is similar: make the rest of the network learn a common representation while the normalization parameters learn language-specific semantics. The One-to-Many and Many-to-One models are trained for translation from English to French, German, Italian and Spanish, and vice versa. The Many-to-Many model is trained on English-French, French-English, English-German and German-English. The image stylization paper specifies how an N-style network can pick up an (N+1)th style by fine-tuning an existing model. Similarly, I fine-tune my Many-to-Many model to pick up Portuguese.
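
A minimal sketch of the idea, assuming the conditioning is done with layer normalization whose gain and bias are looked up per language. The function name, the language keys, and the parameter values are hypothetical; the post does not say exactly which normalization layer or parameterization is used.

```python
import math

def conditional_layer_norm(x, lang, gains, biases, eps=1e-5):
    """Normalize x with shared statistics, then apply language-specific gain and bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]
    return [g * n + b for g, n, b in zip(gains[lang], normed, biases[lang])]

# Hypothetical per-language parameters; in training these would be learned.
gains = {"fr": [1.0, 1.0, 1.0], "de": [2.0, 2.0, 2.0]}
biases = {"fr": [0.0, 0.0, 0.0], "de": [0.5, 0.5, 0.5]}

hidden = [1.0, 2.0, 3.0]
out_fr = conditional_layer_norm(hidden, "fr", gains, biases)
out_de = conditional_layer_norm(hidden, "de", gains, biases)
```

Under this assumption, picking up a new language by fine-tuning could amount to adding one more entry to `gains` and `biases` while the shared weights stay largely in place.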


Hint-Based Training for Non-Autoregressive Machine Translation

arXiv.org Machine Learning

Due to the unparallelizable nature of the autoregressive factorization, AutoRegressive Translation (ART) models have to generate tokens sequentially during decoding and thus suffer from high inference latency. Non-AutoRegressive Translation (NART) models were proposed to reduce the inference time, but could only achieve inferior translation accuracy. In this paper, we propose a novel approach that leverages hints from hidden states and word alignments to help the training of NART models. The results show significant improvement over previous NART models on the WMT14 En-De and De-En datasets and are even comparable to a strong LSTM-based ART baseline, while being one order of magnitude faster in inference.


Adaptive Scheduling for Multi-Task Learning

arXiv.org Machine Learning

To train neural machine translation models simultaneously on multiple tasks (languages), it is common to sample each task uniformly or in proportion to dataset sizes. As these methods offer little control over performance trade-offs, we explore different task scheduling approaches. We first consider existing non-adaptive techniques, then move on to adaptive schedules that over-sample tasks with poorer results compared to their respective baselines. As explicit schedules can be inefficient, especially if one task is highly over-sampled, we also consider implicit schedules, learning to scale learning rates or gradients of individual tasks instead. These techniques allow training multilingual models that perform better for low-resource language pairs (tasks with a small amount of data), while minimizing negative effects on high-resource tasks.
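
An adaptive schedule of this kind could be sketched as follows, assuming that each task's sampling weight is inversely proportional to its current score relative to its baseline (with higher scores being better, as with BLEU). The weighting rule, the task names, and the scores are illustrative assumptions, not the paper's actual schedule.

```python
import random

def adaptive_task_weights(scores, baselines, eps=1e-8):
    """Give more sampling weight to tasks that lag further behind their baseline."""
    raw = {}
    for task, score in scores.items():
        ratio = score / baselines[task]  # below 1.0 means worse than baseline
        raw[task] = 1.0 / max(ratio, eps)
    total = sum(raw.values())
    return {task: w / total for task, w in raw.items()}

# Hypothetical dev-set scores and single-task baselines.
scores = {"en-fr": 30.0, "en-ro": 12.0}
baselines = {"en-fr": 32.0, "en-ro": 20.0}

weights = adaptive_task_weights(scores, baselines)

# Draw the task for the next training batch according to the weights.
rng = random.Random(0)
tasks = list(weights)
next_task = rng.choices(tasks, weights=[weights[t] for t in tasks], k=1)[0]
```

Recomputing the weights after each evaluation lets the schedule shift attention away from a task once it catches up with its baseline.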


Entity Projection via Machine Translation for Cross-Lingual NER

arXiv.org Artificial Intelligence

Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for named entity recognition. Motivated by this fact, we leverage machine translation to improve annotation-projection approaches to cross-lingual named entity recognition. We propose a system that improves over prior entity-projection methods by: (a) leveraging machine translation systems twice: first for translating sentences and subsequently for translating entities; (b) matching entities based on orthographic and phonetic similarity; and (c) identifying matches based on distributional statistics derived from the dataset. Our approach improves upon current state-of-the-art methods for cross-lingual named entity recognition on 5 diverse languages by an average of 4.1 points. Further, our method achieves state-of-the-art F1 scores for Armenian, outperforming even a monolingual model trained on Armenian source data.
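
The orthographic part of step (b) could look something like the sketch below, which uses the similarity ratio from Python's standard `difflib` module as a stand-in measure. The function name, the threshold, and the example strings are hypothetical, and the paper's actual similarity measure (including the phonetic component) is not described in the abstract.

```python
from difflib import SequenceMatcher

def best_orthographic_match(entity, candidates, min_ratio=0.5):
    """Return the candidate most similar in spelling to entity, or None if none is close enough."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, entity.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= min_ratio else None

# Hypothetical example: a source-side entity and target-side candidate tokens.
match = best_orthographic_match("Londres", ["Paris", "London", "Berlin"])
no_match = best_orthographic_match("Xyz", ["Paris"])
```

A character-level ratio like this only helps when both languages share a script; across scripts, a transliteration or phonetic comparison would be needed instead.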