 Machine Translation


Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue

arXiv.org Artificial Intelligence

We consider the problem of designing an artificial agent capable of interacting with humans in collaborative dialogue to produce creative, engaging narratives. In this task, the goal is to establish universe details, and to collaborate on an interesting story in that universe, through a series of natural dialogue exchanges. Our model can augment any probabilistic conversational agent by allowing it to reason about the universe information already established and what potential next utterances might reveal. Ideally, with each utterance, agents would reveal just enough information to add specificity and reduce ambiguity without limiting the conversation. We empirically show that our model allows control over the rate at which the agent reveals information and that doing so significantly improves accuracy in predicting the next line of dialogue from movies. We close with a case study with four professional theatre performers, who preferred interactions with our model-augmented agent over an unaugmented agent.


Learning Efficient Lexically-Constrained Neural Machine Translation with External Memory

arXiv.org Artificial Intelligence

Recent years have witnessed dramatic progress in neural machine translation (NMT); however, methods for manually guiding the translation procedure remain underexplored. Previous work handles this problem through lexically-constrained beam search in the decoding phase. Unfortunately, these lexically-constrained beam search methods suffer from two fatal disadvantages: high computational complexity and hard beam search, which generates unexpected translations. In this paper, we propose to learn the ability of lexically-constrained translation with external memory, which can overcome the above-mentioned disadvantages. For training, phrase pairs are automatically extracted from alignment and sentence parsing, then encoded into an external memory. This memory is then used to provide lexically-constrained information for training through a memory-attention mechanism. Various experiments are conducted on WMT Chinese-to-English and English-to-German tasks. All the results demonstrate the effectiveness of our method.


Unsupervised Text Style Transfer via Iterative Matching and Translation

arXiv.org Artificial Intelligence

Text style transfer seeks to learn how to automatically rewrite sentences from a source domain to the target domain in different styles, while simultaneously preserving their semantic contents. A major challenge in this task stems from the lack of parallel data that connects the source and target styles. Existing approaches try to disentangle content and style, but this is quite difficult and often results in poor content-preservation and grammaticality. In contrast, we propose a novel approach that first constructs a pseudo-parallel resource aligning a subset of sentences with similar content between the source and target corpora. A standard sequence-to-sequence model can then be applied to learn the style transfer. Subsequently, we iteratively refine the learned style transfer function while improving upon the imperfections in our original alignment. Our method is applied to the tasks of sentiment modification and formality transfer, where it outperforms state-of-the-art systems by a large margin. As an auxiliary contribution, we produced a publicly-available test set with human-generated style transfers for future community use.


14 NLP Research Breakthroughs You Can Apply To Your Business

#artificialintelligence

Language understanding is a challenge for computers. Subtle nuances of communication that human toddlers can understand still confuse the most powerful machines. Even though advanced techniques like deep learning can detect and replicate complex language patterns, machine learning models still lack fundamental conceptual understanding of what our words really mean. That said, 2018 did yield a number of landmark research breakthroughs which pushed the fields of natural language processing, understanding, and generation forward. We summarized 14 research papers covering several advances in natural language processing (NLP), including high-performing transfer learning techniques, more sophisticated language models, and newer approaches to content understanding. There are hundreds more papers in NLP, NLU, and NLG which we have not covered in this summary, but we hope this article gives you a solid foundational understanding of the key papers of 2018.


Memory-Efficient Adaptive Optimization for Large-Scale Learning

arXiv.org Machine Learning

Adaptive gradient-based optimizers such as AdaGrad and Adam are among the methods of choice in modern machine learning. These methods maintain second-order statistics of each parameter, thus doubling the memory footprint of the optimizer. In behemoth-size applications, this memory overhead restricts the size of the model being used as well as the number of examples in a mini-batch. We describe a novel, simple, and flexible adaptive optimization method with sublinear memory cost that retains the benefits of per-parameter adaptivity while allowing for larger models and mini-batches. We give convergence guarantees for our method and demonstrate its effectiveness in training very large deep models.
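The memory trade-off described in this abstract can be illustrated with a small sketch. It is an assumption-laden toy, not the authors' published algorithm: it assumes a matrix-shaped parameter and replaces the per-entry accumulator of AdaGrad (m × n values) with one accumulator per row and one per column (m + n values), taking the minimum of the two as the per-entry statistic. The function name `cover_adagrad_step` and all parameter names are hypothetical.

```python
import math

def cover_adagrad_step(w, g, row_acc, col_acc, lr=0.1, eps=1e-8):
    """One AdaGrad-like update that stores m + n accumulators
    instead of m * n (hypothetical sketch, updates w in place)."""
    m, n = len(w), len(w[0])
    new_row = [0.0] * m
    new_col = [0.0] * n
    for i in range(m):
        for j in range(n):
            # Per-entry statistic: the tighter of the row and column
            # accumulators, plus the current squared gradient.
            nu = min(row_acc[i], col_acc[j]) + g[i][j] ** 2
            # Adaptive step, scaled per parameter as in AdaGrad.
            w[i][j] -= lr * g[i][j] / (math.sqrt(nu) + eps)
            # Each accumulator keeps the largest statistic it covers.
            new_row[i] = max(new_row[i], nu)
            new_col[j] = max(new_col[j], nu)
    row_acc[:] = new_row
    col_acc[:] = new_col

# Usage: a 2 x 2 weight matrix with one gradient step.
w = [[1.0, 1.0], [1.0, 1.0]]
g = [[1.0, 0.0], [0.0, 2.0]]
row_acc, col_acc = [0.0, 0.0], [0.0, 0.0]
cover_adagrad_step(w, g, row_acc, col_acc)
```

For an m × n matrix the optimizer state grows as m + n rather than m × n, which is where the saving would come from on large layers; on a 2 × 2 toy the two counts happen to coincide.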


Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference

arXiv.org Machine Learning

Computing the softmax function is expensive when the number of output classes is large. In this paper, we present a novel softmax inference speedup method, Doubly Sparse Softmax (DS-Softmax), that leverages a sparse mixture of sparse experts to efficiently retrieve top-k classes. Different from most existing methods that require and approximate a fixed softmax, our method is learning-based and can adapt softmax weights for a better approximation. In particular, our method learns a two-level hierarchy which divides the entire output class space into several partially overlapping experts. Each expert is sparse and only contains a subset of output classes. To find top-k classes, a sparse mixture enables us to find the most probable expert quickly, and the sparse expert enables us to search within a small-scale softmax. We empirically conduct evaluation on several real-world tasks (including neural machine translation, language modeling and image classification) and demonstrate that significant computation reductions can be achieved without loss of performance.
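The two-level lookup described in this abstract can be sketched roughly as follows. This is a minimal illustration of the inference path only, assuming the gate and the sparse experts have already been learned; the function `top_k_two_level`, the dictionary-based expert layout, and the toy weights are all hypothetical and not taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k_two_level(h, gate_weights, experts, k):
    """Pick the most probable expert, then run a small softmax over
    only the classes that expert contains (hypothetical sketch).

    h            -- hidden vector
    gate_weights -- one weight vector per expert
    experts      -- list of {class_id: weight_vector}; each holds a subset
                    of the classes, and subsets may overlap
    """
    # Level 1: choose the single highest-scoring expert.
    best = max(range(len(gate_weights)), key=lambda e: dot(gate_weights[e], h))
    # Level 2: softmax restricted to that expert's classes.
    logits = {c: dot(wv, h) for c, wv in experts[best].items()}
    mx = max(logits.values())
    z = sum(math.exp(v - mx) for v in logits.values())
    probs = {c: math.exp(v - mx) / z for c, v in logits.items()}
    top = sorted(probs, key=probs.get, reverse=True)[:k]
    return best, [(c, probs[c]) for c in top]

# Usage: four classes split across two overlapping experts.
h = [1.0, 0.0]
gate_weights = [[1.0, 0.0], [-1.0, 0.0]]
experts = [
    {0: [2.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 0.0]},
    {2: [1.0, 0.0], 3: [0.0, 1.0]},
]
expert_id, top2 = top_k_two_level(h, gate_weights, experts, k=2)
```

The cost of a query in this sketch is one dot product per expert plus one per class inside the chosen expert, rather than one per class in the full output space.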


Why Quality Estimation Is The Missing Link For Machine Translation Adoption

#artificialintelligence

While there have been several key developments in machine translation (MT) in recent years, MT has not yet reached the level where businesses might be confident to allow it to proceed unchecked by humans. There is a paradox in that we want to allow artificial intelligence (AI) and automation to take on more and more tasks to relieve pressure on the human workforce, but, in turn, this creates more work for humans in terms of supervising their digital colleagues. We need look no further than the restaurant in China called "Translate Server Error" or Hillary Clinton's gift to the Russian foreign minister that was inscribed with a message that was supposed to say "reset" in Russian but actually showed the word "overcharge." AI still commits fundamental errors that are embarrassing at best and, at worst, can convey offensive and/or completely unintended meanings. This is where the importance of quality estimation (QE) comes to the fore. A good definition of quality estimation comes from eBay, an enthusiastic user of QE: "A method used to automatically provide a quality indication for machine translation output without depending on human reference translations.


Embracing the Future of Content with Linguistic AI

#artificialintelligence

Late last year we hosted SDL's Japan Customer Summit, a one-day event attended by close to 50 of Japan's leading companies from all industries including retail, life sciences, automotive and finance. While the event itself has passed, we wanted to reflect on some of the key highlights that came out of the event and what we heard from a number of industry specialists. The event itself played host to a raft of experts from across SDL, exploring the latest developments in AI and Machine Learning. This blog looks at the second half of the day, which covered practical use cases and scenarios where we showed how the latest technological developments can offer the greatest impact on a business. Mihai Vlad, VP of Machine Learning, explored the world of Machine Learning, more specifically how the accuracy of Machine Translation affects the ROI of content.


He Said, She Said: Addressing Gender in Neural Machine Translation Slator

#artificialintelligence

Artificial intelligence technology has run into a potentially delicate issue: gender bias. In November 2018, mainstream news media reported that Google's automatic suggestion tool for Google Mail will not suggest gender-based pronouns to avoid autocompleting a sentence with the wrong gender. The feature (called Smart Compose) will avoid suggesting genders because, as Gmail Product Manager Paul Lambert put it, "not all 'screw-ups' are equal … [gender is] a big, big thing." Google Translate, which now largely runs on neural machine translation (NMT), had also recently addressed the question of gender bias. On December 6, 2018, Google published a first blog post about its efforts to reduce gender bias in Google Translate.


The Seven Trends in Machine Translation for 2019

#artificialintelligence

Maxim Khalilov is a director of applied artificial intelligence at Unbabel, leading a team of AI engineers to apply AI technologies to meet the needs of the Unbabel business. Prior to his current role, he was a product owner in data science at Booking.com responsible for exploitation, collection and exploitation of digital content for the hospitality market, a CTO at an innovative language service provider, bmmt GmbH, and an R&D manager at TAUS, a resource center for the global language industries. Maxim is also a co-founder of a Natural Language Processing company, NLPPeople.com,