 Africa


Tag-less Back-Translation

arXiv.org Artificial Intelligence

An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of back-translations of the target-side monolingual data. Tagging, or using gates, has been used to enable translation models to distinguish between synthetic and natural data. This improves standard back-translation and also enables the use of iterative back-translation on language pairs that underperformed using standard back-translation. This work presents a simplified approach to distinguishing between the two kinds of data using pretraining and finetuning. The approach, tag-less back-translation, trains the model on the synthetic data and finetunes it on the natural data. Preliminary experiments have shown the approach to consistently outperform the tagging approach on low-resource English-Vietnamese neural machine translation. While the need for tagging (noising) the dataset has been removed, the approach outperformed the tagged back-translation approach by an average of 0.4 BLEU.
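The difference between the two data-handling strategies described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' code: the `<BT>` tag token, the function names, and the placeholder sentence pairs are all assumptions made for the example.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)

BT_TAG = "<BT>"  # hypothetical tag token marking synthetic sources


def tagged_bt_corpus(natural: List[Pair], synthetic: List[Pair]) -> List[Pair]:
    """Tagged back-translation: one mixed corpus, synthetic sources are marked."""
    tagged = [(f"{BT_TAG} {src}", tgt) for src, tgt in synthetic]
    return natural + tagged


def tagless_bt_schedule(natural: List[Pair], synthetic: List[Pair]):
    """Tag-less back-translation: no tags, the two data sets are separated in time.

    Stage 1 trains on synthetic pairs, stage 2 finetunes on natural pairs.
    """
    return [("pretrain", synthetic), ("finetune", natural)]


# Placeholder data (hypothetical, for illustration only).
natural = [("natural source 1", "natural target 1")]
synthetic = [("back-translated source 1", "monolingual target 1"),
             ("back-translated source 2", "monolingual target 2")]

mixed = tagged_bt_corpus(natural, synthetic)
schedule = tagless_bt_schedule(natural, synthetic)
```

In the tagged variant the model sees a single mixed corpus and relies on the tag to tell the data apart; in the tag-less variant the training order carries that information instead.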


Plug and Play Language Models: A Simple Approach to Controlled Text Generation

arXiv.org Artificial Intelligence

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g., switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data, which entails the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human-annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
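The forward and backward pass mentioned in this abstract can be illustrated with a toy model. The sketch below is not the PPLM implementation: it assumes a single hidden vector `h`, a fixed output matrix `W`, and a bag-of-words attribute model whose score is the log of the probability mass on the bag. All names and sizes are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def bag_log_prob(W, h, bag):
    """Attribute score: log of the next-token probability mass on the bag."""
    p = softmax(W @ h)
    return float(np.log(p[bag].sum()))


def bag_gradient(W, h, bag):
    """Gradient of bag_log_prob with respect to the hidden vector h.

    With z = W h, d/dz log(sum_{i in bag} p_i) = q - p, where q is p
    restricted to the bag and renormalised; the chain rule gives W^T (q - p).
    """
    p = softmax(W @ h)
    q = np.zeros_like(p)
    q[bag] = p[bag] / p[bag].sum()
    return W.T @ (q - p)


def steer(W, h, bag, step=0.1, n_steps=5):
    """Nudge h by gradient ascent on the attribute score; W stays frozen."""
    h = h.copy()
    for _ in range(n_steps):
        h = h + step * bag_gradient(W, h, bag)
    return h


rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((20, 8))  # toy vocabulary of 20, hidden size 8
h0 = rng.standard_normal(8)
bag = [3, 7, 11]                        # hypothetical token ids for a topic

h1 = steer(W, h0, bag)
before = bag_log_prob(W, h0, bag)
after = bag_log_prob(W, h1, bag)
```

Only the activation `h` is updated while the LM weights `W` are left untouched, which mirrors the abstract's point that no further training of the LM is required.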


8 life lessons everyone should learn before 2020

#artificialintelligence

Anything you do online can come back to bite you. It's been a decade full of lessons: who to trust, when to speak out and how to stream big events online after you've broken up with your cable company. In 2010, the first iPhone was only three years old. Uber and Lyft didn't exist, and neither did Google Assistant and Siri, Instagram or streaming video. We've come a long way since then, but the next 10 years won't be easy.


ICLR 2020 Accepted Papers Announced

#artificialintelligence

The International Conference on Learning Representations (ICLR 2020) is four months away but has already attracted more than its share of drama, with a deluge of submissions and doubts about the qualifications of some reviewers. Yesterday the conference programme chairs finally put the selection process behind them, announcing that 687 out of 2594 papers had made it to ICLR 2020, a 26.5 percent acceptance rate. ICLR 2020 will be held in Addis Ababa, Ethiopia from April 26 to 30. This will be the first trip to Africa for a major AI conference, a move long encouraged by many leading AI researchers. All accepted papers will be presented as posters as usual, while 23 percent will have an oral presentation.


Is AI a fad?

#artificialintelligence

Every time some genius decides to apply AI where it doesn't belong, the world collectively rolls its eyes and puts another ballot in the AI-Is-A-Fad box. If your dictionary defines AI as magic or robots (or magical robots), of course you'll be disappointed when it doesn't deliver the cure to all that ails you. Let's look at three common gripes using simple examples everyone can grasp. A respectable software engineer once asked me with a straight face, "Can AI know that Canada is a country?" Hold your horses there, cowboy.


Regularized Operating Envelope with Interpretability and Implementability Constraints

arXiv.org Machine Learning

The operating envelope is an important concept in industrial operations. Accurate identification of the operating envelope can be extremely beneficial to stakeholders, as it provides a set of operational parameters that optimizes some key performance indicators (KPI) such as product quality, operational safety, equipment efficiency, environmental impact, etc. Given this importance, data-driven approaches for computing the operating envelope are gaining popularity. These approaches typically use classifiers, such as support vector machines, to set the operating envelope by learning the boundary in the operational parameter space between the manually assigned 'large KPI' and 'small KPI' groups. One challenge to these approaches is that the assignment to these groups is often ad hoc and hence arbitrary. However, a bigger challenge is that they don't take into account two key features that are needed to operationalize operating envelopes: (i) interpretability of the envelope by the operator and (ii) implementability of the envelope from a practical standpoint. In this work, we propose a new definition of the operating envelope which directly targets the expected magnitude of the KPI (i.e., no need to arbitrarily bin the data instances into groups) and accounts for interpretability and implementability. We then propose a regularized 'GA penalty' algorithm that outputs an envelope where the user can trade off between bias and variance. The validity of our proposed algorithm is demonstrated by two sets of simulation studies and an application to a real-world challenge in the mining processes of a flotation plant.

In industrial operations, an important concept is that of the operating envelope. Conceptually, the operating envelope is a set of operational parameters such that some KPI is optimized. In the industrial context, typical KPIs include product quality, operational safety, equipment efficiency, environmental impact, etc. [1]-[4]. The operating envelope has wide application since it directly targets the business outcome and yields actionable recommendations in the operations space.
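To make the idea of an envelope that "directly targets the expected magnitude of KPI" concrete, here is a minimal sketch. It assumes, purely for illustration, that the envelope is an axis-aligned box (one lower and one upper bound per operational parameter), which is one simple form an operator could read directly. The function name, the box shape, and the data are hypothetical and are not taken from the paper.

```python
import numpy as np


def envelope_stats(X, kpi, lower, upper):
    """Mean KPI and data coverage of a box-shaped envelope.

    X     : (n, d) operational parameters, one row per observation
    kpi   : (n,)   observed KPI values
    lower : (d,)   lower bound per parameter (assumed box envelope)
    upper : (d,)   upper bound per parameter
    """
    inside = np.all((X >= lower) & (X <= upper), axis=1)
    coverage = float(inside.mean())
    mean_kpi = float(kpi[inside].mean()) if inside.any() else float("nan")
    return mean_kpi, coverage


# Hypothetical observations: two parameters, four operating points.
X = np.array([[0.1, 0.2],
              [0.5, 0.5],
              [0.6, 0.4],
              [0.9, 0.9]])
kpi = np.array([1.0, 4.0, 6.0, 2.0])

mean_kpi, coverage = envelope_stats(
    X, kpi, lower=np.array([0.4, 0.3]), upper=np.array([0.7, 0.6])
)
```

No binning into 'large KPI' and 'small KPI' groups is needed here: candidate envelopes are compared by the average KPI of the points they contain, and a tighter box typically trades higher average KPI for fewer supporting observations.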


Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks

arXiv.org Machine Learning

The success of deep neural networks in many real-world applications is leading to new challenges in building more efficient architectures. One effective way of making networks more efficient is neural network compression. We provide an overview of existing neural network compression methods that can be used to make neural networks more efficient by changing the architecture of the network. First, we introduce a new way to categorize all published compression methods, based on the amount of data and compute needed to make the methods work in practice; this yields three 'levels of compression solutions'. Second, we provide a taxonomy of tensor-factorization-based and probabilistic compression methods. Finally, we perform an extensive evaluation of different compression techniques from the literature for models trained on ImageNet. We show that SVD and probabilistic compression or pruning methods are complementary and give the best results of all the considered methods. We also provide practical ways to combine them.
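As a small illustration of the SVD-based compression mentioned in this abstract, the sketch below factorizes one dense weight matrix into two thinner ones with a truncated SVD. It is a generic example rather than the paper's procedure; the layer size, the rank, and the function name are assumptions.

```python
import numpy as np


def svd_compress(W, rank):
    """Factorize W (m x n) into A (m x rank) and B (rank x n) by truncated SVD.

    A single layer y = W x becomes two smaller layers y = A (B x).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept column by its singular value
    B = Vt[:rank, :]
    return A, B


rng = np.random.default_rng(0)
# Hypothetical 64 x 32 layer that happens to have exact rank 4.
W = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))

A, B = svd_compress(W, rank=4)
params_before = W.size           # 64 * 32 = 2048
params_after = A.size + B.size   # 4 * (64 + 32) = 384
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

The parameter count falls from m·n to rank·(m + n). For real trained weights the matrix is rarely exactly low rank, so the chosen rank controls a trade-off between size and approximation error.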


Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks

arXiv.org Machine Learning

The optimization of multilayer neural networks typically leads to a solution with zero training error, yet the landscape can exhibit spurious local minima and the minima can be disconnected. In this paper, we shed light on this phenomenon: we show that the combination of stochastic gradient descent (SGD) and over-parameterization makes the landscape of multilayer neural networks approximately connected and thus more favorable to optimization. More specifically, we prove that SGD solutions are connected via a piecewise linear path, and the increase in loss along this path vanishes as the number of neurons grows large. This result is a consequence of the fact that the parameters found by SGD are increasingly dropout stable as the network becomes wider. We show that, if we remove part of the neurons (and suitably rescale the remaining ones), the change in loss is independent of the total number of neurons, and it depends only on how many neurons are left. Our results exhibit a mild dependence on the input dimension: they are dimension-free for two-layer networks and depend linearly on the dimension for multilayer networks. We validate our theoretical findings with numerical experiments for different architectures and classification tasks.
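The "remove part of the neurons and suitably rescale the remaining ones" operation can be sketched for a two-layer ReLU network as below. One loud caveat: the weights here are random, not SGD solutions, so the example only shows the mechanics of dropping and rescaling and does not reproduce the paper's result. The network form, the sizes, and the function names are assumptions.

```python
import numpy as np


def two_layer(x, W, a):
    """Two-layer ReLU network: f(x) = sum_i a_i * relu(w_i . x)."""
    return float(a @ np.maximum(W @ x, 0.0))


def drop_and_rescale(W, a, keep):
    """Keep the first `keep` hidden neurons and rescale their output weights.

    Multiplying by n / keep compensates for the removed neurons, in the
    same way that dropout rescales the surviving activations.
    """
    n = a.shape[0]
    return W[:keep], a[:keep] * (n / keep)


rng = np.random.default_rng(0)
n, d = 4000, 10                      # hidden width and input dimension
W = rng.standard_normal((n, d))
a = rng.standard_normal(n) / n       # 1/n scaling keeps the output O(1)
x = np.ones(d) / np.sqrt(d)          # unit-norm input

full = two_layer(x, W, a)
half = two_layer(x, *drop_and_rescale(W, a, keep=n // 2))
gap = abs(full - half)               # change in output after dropping half
```

A network is dropout stable in this sense when `gap` (or, more precisely, the change in loss) stays small after the removal. The abstract's claim is that the change depends only on how many neurons are left, not on the total number.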


EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

arXiv.org Machine Learning

The implementation of Deep Convolutional Neural Networks (ConvNets) on tiny end-nodes with limited non-volatile memory space calls for smart compression strategies capable of shrinking the footprint yet preserving predictive accuracy. There exist a number of strategies for this purpose, from those that play with the topology of the model or the arithmetic precision, e.g., pruning and quantization, to those that perform a model-agnostic compression, e.g., weight encoding. The tighter the memory constraint, the higher the probability that these techniques alone cannot meet the requirement, hence more awareness and cooperation across different optimizations become mandatory. This work addresses the issue by introducing EAST, Encoding-Aware Sparse Training, a novel memory-constrained training procedure that leads quantized ConvNets towards deep memory compression. EAST implements an adaptive group pruning designed to maximize the compression rate of the weight encoding scheme (the LZ4 algorithm in this work). Compared to existing methods, EAST meets the memory constraint with lower sparsity, hence ensuring higher accuracy. Experiments conducted on a state-of-the-art ConvNet (ResNet-9) deployed on a low-power microcontroller (ARM Cortex-M4) validate the proposal.
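The interaction between group pruning and weight encoding can be illustrated with a short sketch. Two assumptions are made loudly: the standard-library `zlib` codec stands in for LZ4 (which is not in the Python standard library), and the pruning is a one-shot, magnitude-based rule rather than EAST's adaptive, training-time procedure. The group size, sparsity level, and function names are hypothetical.

```python
import zlib

import numpy as np


def prune_groups(w, group_size, sparsity):
    """Zero out the fraction `sparsity` of weight groups with smallest L1 norm."""
    groups = w.reshape(-1, group_size).copy()
    norms = np.abs(groups.astype(np.int32)).sum(axis=1)
    n_prune = int(sparsity * len(groups))
    groups[np.argsort(norms)[:n_prune]] = 0
    return groups.reshape(w.shape)


def compressed_size(w):
    """Bytes needed after lossless encoding (zlib used as a stand-in for LZ4)."""
    return len(zlib.compress(w.tobytes(), 9))


rng = np.random.default_rng(0)
# Hypothetical int8-quantized weights of one layer.
dense = rng.integers(-127, 128, size=4096).astype(np.int8)
pruned = prune_groups(dense, group_size=16, sparsity=0.5)

size_dense = compressed_size(dense)
size_pruned = compressed_size(pruned)
```

Zeroing whole groups creates long runs of identical bytes, which dictionary-based encoders compress well. That is the sense in which pruning can be made "encoding-aware": the sparsity pattern is chosen with the encoder's behaviour in mind.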


France deploys armed drones in Sahel anti-jihadi fight

The Japan Times

PARIS – France has officially deployed its first armed drones, three American-built Reapers fitted with laser-guided missiles, in its fight against a jihadi insurrection in Africa's Sahel region, Defense Minister Florence Parly announced Thursday. The drones, which have provided surveillance support to the French anti-jihadi Barkhane mission in Mali, Niger and Burkina Faso since 2014, will from now on also be able to strike targets, she said. France joins a small club of countries, including the United States, Britain and Israel, that use armed, remotely piloted aircraft in combat. The Reapers will each carry two 250-kg (550-pound) laser-guided bombs, and are entering service after a series of operational tests carried out from the airbase in the Niger capital, Niamey. "Their main missions remain surveillance and intelligence … but these can be extended to strikes," Parly said.