Machine Translation
What Did We Learn at the New Work Summit?
MR. METZ This is an ongoing problem. There have been very real and very significant gains in image recognition, speech recognition and language translation over the last several years. That can help with talking digital assistants, driverless cars and certain aspects of health care -- not to mention face recognition services and autonomous weapons. Driverless cars are still years from the mainstream. Better translation is very different from a more general intelligence that can do anything a human can do.
How machine learning can be used to break down language barriers
Machine learning has transformed major aspects of the modern world with great success. Self-driving cars, intelligent virtual assistants on smartphones, and cybersecurity automation are all examples of how far the technology has come. But of all the applications of machine learning, few have the potential to so radically shape our economy as language translation. The content of language translation is the perfect model for machine learning to tackle. Language operates on a set of predictable rules, but with a degree of variation that makes it difficult for humans to interpret.
Calibration of Encoder Decoder Models for Neural Machine Translation
Kumar, Aviral, Sarawagi, Sunita
We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs like in NMT, calibration is important not just for reliable confidence with predictions, but also for proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation leads to two main reasons -- severe miscalibration of EOS (end of sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam-search.
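To make "calibration" concrete, here is a minimal sketch of expected calibration error (ECE), a widely used calibration measure. The abstract does not say which metric the authors use, so the choice of ECE, the function name, and the ten equal-width bins are all assumptions made for illustration. Each token prediction contributes its confidence (the probability assigned to the predicted token) and whether that prediction matched the reference.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins.

    confidences: probabilities in [0, 1] assigned to the predicted tokens.
    correct: booleans, True where the predicted token matched the reference.
    num_bins: number of equal-width bins (an illustrative default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions by confidence; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(num_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(accuracy - mean_conf)
    return ece
```

A model that predicts with 0.9 confidence but is right only three times out of four gets an ECE of about 0.15 on those predictions; a well-calibrated model would score close to zero.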
Jointly Optimizing Diversity and Relevance in Neural Response Generation
Gao, Xiang, Lee, Sungjin, Zhang, Yizhe, Brockett, Chris, Galley, Michel, Gao, Jianfeng, Dolan, Bill
Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a method to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.
Non-Parametric Adaptation for Neural Machine Translation
Neural networks trained with gradient descent are known to be susceptible to catastrophic forgetting caused by parameter shift during the training process. In the context of Neural Machine Translation (NMT) this results in poor performance on heterogeneous datasets and on sub-tasks like rare phrase translation. On the other hand, non-parametric approaches are immune to forgetting, perfectly complementing the generalization ability of NMT. However, attempts to combine non-parametric or retrieval-based approaches with NMT have only been successful on narrow domains, possibly due to over-reliance on sentence-level retrieval. We propose a novel n-gram-level retrieval approach that relies on local phrase-level similarities, allowing us to retrieve neighbors that are useful for translation even when overall sentence similarity is low. We complement this with an expressive neural network, allowing our model to extract information from the noisy retrieved context. We evaluate our semi-parametric NMT approach on a heterogeneous dataset composed of WMT, IWSLT, JRC-Acquis and OpenSubtitles, and demonstrate gains on all 4 evaluation sets. The semi-parametric nature of our approach opens the door for non-parametric domain adaptation, demonstrating strong inference-time adaptation performance on new domains without the need for any parameter updates.
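The idea of retrieving neighbors by local phrase overlap rather than whole-sentence similarity can be sketched as follows. This is a deliberately simplified illustration, not the authors' method: it ranks stored sentence pairs by the raw count of shared word bigrams, and the function names, the bigram default, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(corpus, n=2):
    """Map each source-side n-gram to the ids of sentence pairs containing it."""
    index = defaultdict(set)
    for sent_id, (source, _target) in enumerate(corpus):
        for gram in ngrams(source.lower().split(), n):
            index[gram].add(sent_id)
    return index


def retrieve(query, corpus, index, n=2, top_k=2):
    """Return up to top_k sentence pairs sharing the most n-grams with query."""
    votes = Counter()
    for gram in set(ngrams(query.lower().split(), n)):
        for sent_id in index.get(gram, ()):
            votes[sent_id] += 1
    return [corpus[sent_id] for sent_id, _count in votes.most_common(top_k)]


# Toy (source, target) pairs, invented for illustration only.
corpus = [
    ("the cat sat on the mat", "die Katze sitzt auf der Matte"),
    ("he signed the lease agreement yesterday", "er unterschrieb gestern den Mietvertrag"),
    ("the weather is nice", "das Wetter ist gut"),
]
index = build_index(corpus)
neighbors = retrieve("please review the lease agreement", corpus, index)
```

Here the query shares little with any stored sentence as a whole, but the bigrams "the lease" and "lease agreement" are enough to surface the second pair, which carries the phrase translation a downstream model could use.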
Neuroscience-Inspired Artificial Intelligence
Learning to combine foveal glimpses with a third-order Boltzmann machine. Multiple object recognition with visual attention. Show, attend and tell: neural image caption generation with visual attention. Neural machine translation by jointly learning to align and translate. Learning what and where to draw.
State-Of-The-Art Methods For Neural Machine Translation & Multilingual Tasks
The quality of machine translation produced by state-of-the-art models is already quite high and often requires only minor corrections from professional human translators. This is especially true for high-resource language pairs like English-German and English-French. So, the main focus of recent research studies in machine translation was on improving system performance for low-resource language pairs, where we have access to large monolingual corpora in each language but do not have sufficiently large parallel corpora. Facebook AI researchers seem to lead in this research area and have introduced several interesting solutions for low-resource machine translation during the last year. This includes augmenting the training data with back-translation, learning joint multilingual sentence representations, as well as extending BERT to a cross-lingual setting.
Gong.io Transforming CRM Solutions with AI
News that Gong.io has secured $40M to add to its existing funding shows the company is well on its way to transforming CRM systems with the use of artificial intelligence. Gong.io is relatively new on the scene but is already making big moves in the CRM industry. Its AI-assisted Conversation Intelligence and account analytics platform removes guesswork and unknowns, helping sales and marketing teams formulate winning sales strategies. Founded in 2015 by Amit Bendov, CEO, and Eilon Reshef, CTO, the San Francisco- and Israel-based company has two very successful people at the helm. Both founders have impressive resumes with proven track records in growing, selling, and IPOing startups.
Dr. Technophile or: How Localizers Learned to Stop Worrying and Love AI
The future of the language industry is bright. In a world where globalization brings us closer together, advances in technology make it easier than ever to communicate and conduct our work efficiently. The primary purpose of a machine is to facilitate a specific task; so, the question remains, why do so many of us fear the rise of artificial intelligence (AI)? Admittedly, the notion of a machine learning to navigate an area so intimately human as language is disquieting. Where do humans fit in an industry that is so eager to introduce machine learning technologies?
Non-Autoregressive Machine Translation with Auxiliary Regularization
Wang, Yiren, Tian, Fei, He, Di, Qin, Tao, Zhai, ChengXiang, Liu, Tie-Yan
As a new neural machine translation approach, non-autoregressive machine translation (NAT) has attracted attention recently due to its high efficiency in inference. However, the high efficiency has come at the cost of not capturing the sequential dependency on the target side of translation, which causes NAT to suffer from two kinds of translation errors: 1) repeated translations (due to indistinguishable adjacent decoder hidden states), and 2) incomplete translations (due to incomplete transfer of source-side information via the decoder hidden states). In this paper, we propose to address these two problems by improving the quality of decoder hidden representations via two auxiliary regularization terms in the training process of an NAT model. First, to make the hidden states more distinguishable, we regularize the similarity between consecutive hidden states based on the corresponding target tokens. Second, to force the hidden states to contain all the information in the source sentence, we leverage the dual nature of translation tasks (e.g., English to German and German to English) and minimize a backward reconstruction error to ensure that the hidden states of the NAT decoder are able to recover the source-side sentence. Extensive experiments conducted on several benchmark datasets show that both regularization strategies are effective and can alleviate the issues of repeated translations and incomplete translations in NAT models. The accuracy of NAT models is therefore improved significantly over the state-of-the-art NAT models with even better efficiency for inference.
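The first regularizer, which ties the similarity of consecutive hidden states to the corresponding target tokens, can be illustrated with a small sketch. The abstract does not give the formula, so the per-pair penalty used here, 1 + cos(h_t, h_t+1) * (1 - cos(y_t, y_t+1)), is an assumed form chosen for illustration, and the function names are hypothetical. The intent it captures: adjacent hidden states that look alike are penalized when the adjacent target token embeddings differ, which is the situation associated with repeated translations.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_penalty(hidden_states, target_embeddings):
    """Average penalty over adjacent positions (assumed form, see lead-in).

    hidden_states: list of decoder hidden vectors, one per target position.
    target_embeddings: list of embeddings of the reference target tokens.
    """
    pairs = min(len(hidden_states), len(target_embeddings)) - 1
    if pairs < 1:
        return 0.0
    total = 0.0
    for t in range(pairs):
        hidden_sim = cosine(hidden_states[t], hidden_states[t + 1])
        target_sim = cosine(target_embeddings[t], target_embeddings[t + 1])
        # High hidden similarity is costly only when the targets are dissimilar.
        total += 1.0 + hidden_sim * (1.0 - target_sim)
    return total / pairs
```

With two distinct target tokens, identical adjacent hidden states give the largest penalty under this form, while orthogonal hidden states give a lower one; when the two target tokens are the same, the hidden-state similarity is not penalized at all.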