 Machine Translation


BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

arXiv.org Machine Learning

BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and many other more recent pretraining schemes. We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also report ablation experiments that replicate other pretraining schemes within the BART framework, to better measure which factors most influence end-task performance.
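The two noising operations named in the abstract, sentence shuffling and span in-filling, can be illustrated with a minimal sketch. The function names, the `<mask>` symbol, and the explicit `(start, length)` span format are assumptions made for illustration; the abstract does not say how spans are sampled, so the caller supplies them here.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, chosen for this sketch


def infill_spans(tokens, spans, mask=MASK):
    """Replace each (start, length) span of tokens with a single mask token.

    Spans must be sorted by start and must not overlap. A span of
    length 0 inserts a mask without removing any token.
    """
    out = []
    pos = 0
    for start, length in spans:
        if start < pos:
            raise ValueError("spans must be sorted and non-overlapping")
        out.extend(tokens[pos:start])  # copy the untouched tokens
        out.append(mask)               # the whole span becomes one mask
        pos = start + length
    out.extend(tokens[pos:])
    return out


def shuffle_sentences(sentences, seed=0):
    """Return the sentences in a random order (seeded for repeatability)."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


corrupted = infill_spans(["the", "cat", "sat", "on", "the", "mat"], [(1, 2)])
# corrupted == ["the", "<mask>", "on", "the", "mat"]
```

Because a span of any length collapses to one mask token, a model trained to undo this corruption also has to infer how many tokens are missing.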


The Automated Copywriter: Algorithmic Rephrasing of Health-Related Advertisements to Improve their Performance

arXiv.org Artificial Intelligence

Search advertising is one of the most commonly used methods of advertising. Past work has shown that search advertising can be employed to improve health by eliciting positive behavioral change. However, writing effective advertisements requires expertise and (possibly expensive) experimentation, both of which may not be available to public health authorities wishing to elicit such behavioral changes, especially when dealing with a public health crisis such as an epidemic outbreak. Here we develop an algorithm which builds on past advertising data to train a sequence-to-sequence Deep Neural Network which "translates" advertisements into optimized ads that are more likely to be clicked. The network is trained using more than 114 thousand ads shown on Microsoft Advertising. We apply this translator to two health-related domains: Medical Symptoms (MS) and Preventative Healthcare (PH) and measure the improvements in click-through rates (CTR). Our experiments show that the generated ads are predicted to have higher CTR in 81% of MS ads and 76% of PH ads. To understand the differences between the generated ads and the original ones we develop estimators for the affective attributes of the ads. We show that the generated ads contain more calls-to-action and that they reflect higher valence (36% increase) and higher arousal (87%) on a sample of 1000 ads. Finally, we run an advertising campaign where 10 random ads and their rephrased versions from each of the domains are run in parallel. We show an average improvement in CTR of 68% for the generated ads compared to the original ads. Our results demonstrate the ability to automatically optimize advertisements for the health domain. We believe that our work offers health authorities an improved ability to help nudge people towards healthier behaviors while saving the time and cost needed to optimize advertising campaigns.
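For readers unfamiliar with the metric, the sketch below shows how a click-through rate and a relative CTR improvement are typically computed. The function names and the click and impression counts are hypothetical, chosen only so the arithmetic lands on a 68% lift; they are not data from the paper.

```python
def ctr(clicks, impressions):
    """Click-through rate: the fraction of impressions that led to a click."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(original_ctr, generated_ctr):
    """Relative improvement of the generated ad over the original ad."""
    return (generated_ctr - original_ctr) / original_ctr


# Hypothetical counts for illustration only.
original = ctr(clicks=50, impressions=1000)   # 0.05
generated = ctr(clicks=84, impressions=1000)  # 0.084
lift = relative_lift(original, generated)     # about 0.68, i.e. a 68% improvement
```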


Firefox will soon be able to translate web pages live (and no, it won't use Google)

#artificialintelligence

Firefox will soon be able to translate web pages into other languages – and will do so without using any third-party cloud-based services such as Google Translate or Bing Translator. Instead, the translation will happen entirely on your own device, which is in keeping with Mozilla's stated aim to let users keep control of their data (in this case, their identity and the content of the web pages they're viewing), and will keep costs down as there's no need for external processing. As ZDNet reports, this will be made possible by a translation library being developed as part of The Bergamot Project, which is dedicated to developing and improving client-side translation using machine learning. The Bergamot Project received a grant of €3 million (about $3.3 million / £2.6 million / AU$4.9 million) from the EU earlier this year to increase the uptake of language technologies in situations where confidentiality is essential. Mozilla has considered adding translation to Firefox before, but scrapped the idea due to the costs involved.


Why Hasn't AI Mastered Language Translation? - Liwaiwai

#artificialintelligence

Their creator observed, "And now nothing will be restrained from them, which they have imagined to do." According to the myth, God thwarted this effort by creating diverse languages so that they could no longer collaborate. Language remains a barrier in business and marketing. Even though technological devices can quickly and easily connect, humans from different parts of the world often can't. Translation agencies step in, making presentations, contracts, outsourcing instructions, and advertisements comprehensible to all intended recipients.


Fast Structured Decoding for Sequence Models

arXiv.org Machine Learning

Autoregressive sequence models achieve state-of-the-art performance in domains like machine translation. However, because of their autoregressive factorization, these models suffer from heavy latency during inference. Recently, non-autoregressive sequence models were proposed to speed up inference. However, these models assume that the decoding process of each token is conditionally independent of the others. Such a generation process sometimes makes the output sentence inconsistent, and thus the learned non-autoregressive models could only achieve inferior accuracy compared to their autoregressive counterparts. To improve the decoding consistency and reduce the inference cost at the same time, we propose to incorporate a structured inference module into the non-autoregressive models. Specifically, we design an efficient approximation for Conditional Random Fields (CRF) for non-autoregressive sequence models, and further propose a dynamic transition technique to model positional contexts in the CRF. Experiments in machine translation show that while adding little latency (8~14 ms), our model could achieve significantly better translation performance than previous non-autoregressive models on different translation datasets. In particular, for the WMT14 En-De dataset, our model obtains a BLEU score of 26.80, which largely outperforms the previous non-autoregressive baselines and is only 0.61 lower in BLEU than purely autoregressive models.
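To make the role of a CRF layer concrete, here is a minimal sketch of exact Viterbi decoding for a plain linear-chain CRF. This is the textbook algorithm, not the paper's efficient approximation or its dynamic transition technique; the function name and the list-of-lists score format are assumptions. Per-position scores alone would pick each token independently, while the transition scores let neighbouring choices influence one another.

```python
def viterbi(emissions, transitions):
    """Find the highest-scoring label sequence of a linear-chain CRF.

    emissions[t][k]  : score of label k at position t (T x K)
    transitions[i][j]: score of moving from label i to label j (K x K)
    Returns (best_path, best_score).
    """
    num_labels = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each label
    backptrs = []               # backptrs[t-1][j]: best previous label for j at t

    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(num_labels):
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Follow the back-pointers from the best final label.
    last = max(range(num_labels), key=lambda j: score[j])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


emissions = [[1, 0], [0, 1], [1, 0]]
independent = viterbi(emissions, [[0, 0], [0, 0]])      # ([0, 1, 0], 3)
consistent = viterbi(emissions, [[0, -5], [-5, 0]])     # ([0, 0, 0], 2)
```

With zero transition scores the decoder reduces to a per-position argmax; penalizing label switches changes the middle decision, which is the kind of cross-token dependency the structured module is meant to restore.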


Indic Language Computing

Communications of the ACM

In April 2019, following the Easter Sunday bomb attacks, the Government of Sri Lanka had to shut down Facebook and YouTube for nine days to stop the spreading of hate speech and false news, posted mainly in the local languages Sinhala and Tamil. This came about simply because these social media platforms did not have the capability to detect and warn about the provocative content. India's Ministry of Human Resource Development (MHRD) wants lectures on Swayam and NPTEL, the online teaching platforms, to be translated into all Indian languages. Approximately 2.5 million students use the Swayam lectures on computer science alone. The lectures are in English, which students find difficult to understand.


Artificial intelligence: is it a double-edged sword?

#artificialintelligence

Artificial Intelligence (AI) is already reconfiguring the world in conspicuous ways. Data drives our global digital ecosystem, and AI technologies reveal patterns in data. Smartphones, smart homes, and smart cities influence how we live and interact, and AI systems are increasingly involved in recruitment decisions, medical diagnoses and judicial verdicts. Whether this scenario is utopian or dystopian depends on your perspective. The potential risks of AI are enumerated repeatedly.


Diversifying Topic-Coherent Response Generation for Natural Multi-turn Conversations

arXiv.org Artificial Intelligence

Although response generation (RG) diversification for single-turn dialogs has been well developed, it is less investigated for natural multi-turn conversations. Moreover, past work focused on diversifying responses without considering topic coherence with the context, producing uninformative replies. In this paper, we propose the Topic-coherent Hierarchical Recurrent Encoder-Decoder model (THRED) to diversify the generated responses without deviating from the contextual topics of multi-turn conversations. Overall, we build a sequence-to-sequence net (Seq2Seq) to model multi-turn conversations. We then use the latent Variable Hierarchical Recurrent Encoder-Decoder model (VHRED) to learn the global contextual distribution of dialogs. In addition, we construct a dense topic matrix which captures word-level correlations in the conversation corpora. The topic matrix is used to learn the local topic distribution of the contextual utterances. By incorporating both the global contextual distribution and the local topic distribution, THRED produces both diversified and topic-coherent replies. We also propose an explicit metric (TopicDiv) to measure the topic divergence between the post and the generated response, and an overall metric combining the diversification metric (Distinct) and TopicDiv. We evaluate our model against three baselines (Seq2Seq, HRED and VHRED) on two real-world corpora, and demonstrate its outstanding performance in both diversification and topic coherence.
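The Distinct metric mentioned in the abstract is commonly computed as the ratio of unique n-grams to total n-grams across the generated responses. The sketch below follows that common definition; the function name and the pre-tokenized input format are assumptions, and the paper's exact variant may differ. TopicDiv is not sketched because the abstract does not define it.

```python
def distinct_n(responses, n=1):
    """Ratio of unique n-grams to total n-grams over all responses.

    responses: a list of token lists. Higher values mean more diverse
    output. Returns 0.0 when no n-gram can be formed.
    """
    unique = set()
    total = 0
    for tokens in responses:
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0


replies = [["i", "do", "not", "know"], ["i", "do", "not", "care"]]
d1 = distinct_n(replies, n=1)  # 5 unique unigrams out of 8
d2 = distinct_n(replies, n=2)  # 4 unique bigrams out of 6
```

Generic, repetitive replies share most of their n-grams, so they pull the score down.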


IBM Brings AI Retrosynthetic Analysis to the Cloud - IBM Research Blog

#artificialintelligence

The future of computing is one of the strongest transformational forces on our planet. Everything we touch has built-in computing capabilities and is generating tremendous volumes of data. The impact is speeding up not only our daily lives but also more traditional industrial sectors, including chemistry. Last year at the ACS Fall Meeting 2018 in Boston, IBM Research released IBM RXN for Chemistry, a cloud-based app that builds on the idea of treating organic chemistry as a language. The magic behind the app is a state-of-the-art neural machine translation method, which can predict the most likely outcome of a chemical reaction using sequence-to-sequence (seq2seq) models.


What is Image Annotation? – An Intro to 5 Image Annotation Services

#artificialintelligence

Image annotation is one of the most important tasks in computer vision. With numerous applications, computer vision essentially strives to give a machine eyes – the ability to see and interpret the world. At times, machine learning projects seem to unlock futuristic technology we never thought possible. AI-powered applications like augmented reality, automatic speech recognition, and neural machine translation have the potential to change lives and businesses around the world. Likewise, the technologies that computer vision can give us (autonomous vehicles, facial recognition, unmanned drones) are extraordinary.