 Machine Translation


Latent Translation: Crossing Modalities by Bridging Generative Models

arXiv.org Machine Learning

End-to-end optimization has achieved state-of-the-art performance on many specific problems, but there is no straightforward way to combine pretrained models for new problems. Here, we explore improving modularity by learning a post-hoc interface between two existing models to solve a new task. Specifically, we take inspiration from neural machine translation, and cast the challenging problem of cross-modal domain transfer as unsupervised translation between the latent spaces of pretrained deep generative models. By abstracting away the data representation, we demonstrate that it is possible to transfer across different modalities (e.g., image-to-audio) and even different types of generative models (e.g., VAE-to-GAN). We compare to state-of-the-art techniques and find that a straightforward variational autoencoder is able to best bridge the two generative models through learning a shared latent space. We can further impose supervised alignment of attributes in both domains with a classifier in the shared latent space. Through qualitative and quantitative evaluations, we demonstrate that locality and semantic alignment are preserved through the transfer process, as indicated by high transfer accuracies and smooth interpolations within a class. Finally, we show this modular structure speeds up training of new interface models by several orders of magnitude by decoupling it from expensive retraining of base generative models.


Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

arXiv.org Machine Learning

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it contains existing implementations of a large number of utilities, helper functions, and the newest research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.


What Microsoft and Google Are Not Telling You About Their A.I.

#artificialintelligence

In September of 2018, iFlytek, a Chinese technology company and world leader in A.I. -- particularly in voice recognition software -- was accused of disguising human translation as machine translation during a tech conference in Shanghai. The whistleblower was an interpreter, Bell Wang, who was doing live translation at the conference. He noticed that iFlytek was using his translations as live subtitles on a screen next to the company's brand logo. This gave the appearance that the translated output was produced by their A.I. system, rather than by Wang. The company was also broadcasting the translations live online using a computer-synthesized voice, instead of the original human interpreters' voices.


The future of content is autonomous - London Business News - Londonlovesbusiness.com

#artificialintelligence

SDL, a global leader in content creation, translation and delivery, today calls on brands to rethink current content strategies and prepare for a digital future where content supply chains are autonomous, machine-first and human-optimized, for greater impact with worldwide audiences across any language and device. Companies are struggling to handle the growing volume and velocity of content required to engage with global audiences. And it's expected to get worse: 93% say the content they produce will increase in the next two years. SDL's Enabling the Future of Content report addresses these challenges, offering insights on how companies can move towards an autonomous content supply chain of the future, capable of delivering any type of content to global audiences. Peggy Chen, CMO of SDL, said, "Engaging with customers globally requires content, and lots of it.


Semantic Neural Machine Translation using AMR

arXiv.org Artificial Intelligence

It is intuitive that semantic representations can be useful for machine translation, mainly because they can help in enforcing meaning preservation and handling data sparsity (many sentences correspond to one meaning) of machine translation models. On the other hand, little work has been done on leveraging semantics for neural machine translation (NMT). In this work, we study the usefulness of AMR (short for abstract meaning representation) on NMT. Experiments on a standard English-to-German dataset show that incorporating AMR as additional knowledge can significantly improve a strong attention-based sequence-to-sequence neural translation model.


A spelling correction model for end-to-end speech recognition

arXiv.org Artificial Intelligence

Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language model component of the end-to-end model is only trained on transcribed audio-text pairs, which leads to performance degradation, especially on rare words. While a variety of work has looked at incorporating an external LM trained on text-only data into the end-to-end framework, none of it has taken into account the characteristic error distribution made by the model. In this paper, we propose a novel approach to utilizing text-only data, by training a spelling correction (SC) model to explicitly correct those errors. On the LibriSpeech dataset, we demonstrate that the proposed model results in an 18.6% relative improvement in WER over the baseline model when directly correcting the top ASR hypothesis, and a 29.0% relative improvement when further rescoring an expanded n-best list using an external LM.
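For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) and a relative improvement in WER are commonly computed. It is not the paper's evaluation code: the function names `word_error_rate` and `relative_improvement` are hypothetical, whitespace tokenization is an assumption, and the numbers in the usage lines are made up for illustration rather than taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumes simple whitespace tokenization (an illustrative choice).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (not results from the paper).
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
gain = relative_improvement(baseline_wer=10.0, new_wer=8.0)
```

A relative improvement is measured against the baseline's own error rate, so the same percentage corresponds to a smaller absolute WER reduction when the baseline is already low.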


Amazon, Google, Microsoft Press Further into Customized Language Tech and Services - Slator

#artificialintelligence

Companies such as Amazon, Google, Microsoft, and many others have rapidly expanded their machine learning offerings and now increasingly encroach on the heart of language services. Take Bridgeman Images, for example. Bridgeman is a "specialist in the distribution of fine art, cultural and historical media for reproduction" -- the Getty Images of the art world, if you will. According to an Amazon case study published on February 6, 2019, the company needed automated translation to localize into many languages at scale. They opted for Amazon Web Services' Amazon Translate to localize "570 million English characters into Italian, French, German, and Spanish" over the course of 15 days.


Is the era of artificial speech translation upon us?

The Guardian

Noise, Alex Waibel tells me, is one of the major challenges that artificial speech translation has to meet. A device may be able to recognise speech in a laboratory, or a meeting room, but will struggle to cope with the kind of background noise I can hear surrounding Professor Waibel as he speaks to me from Kyoto station. I'm struggling to follow him in English, on a scratchy line that reminds me we are nearly 10,000km apart – and that distance is still an obstacle to communication even if you're speaking the same language. We haven't reached the future yet. If we had, Waibel would have been able to speak in his native German and I would have been able to hear his words in English.


Google Translate is a manifestation of Wittgenstein's theory of language

#artificialintelligence

More than 60 years after philosopher Ludwig Wittgenstein's theories on language were published, the artificial intelligence behind Google Translate has provided a practical example of his hypotheses. Patrick Hebron, who works on machine learning in design at Adobe and studied philosophy with Wittgenstein expert Garry Hagberg for his bachelor's degree at Bard College, notes that the networks behind Google Translate are a very literal representation of Wittgenstein's work. Google employees have previously acknowledged that Wittgenstein's theories gave them a breakthrough in making their translation services more effective, but somehow, this key connection between philosophy of language and artificial intelligence has long gone under-celebrated and overlooked. The translation service relies on an algorithm created by Google employees called word2vec, which creates "vector representations" for words, which essentially means that each word is represented numerically. For the translations to work, programmers have to then create a "neural network," a form of machine learning, that's trained to understand how these words relate to each other.
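The idea that "each word is represented numerically" can be illustrated with a small sketch. The vectors below are invented toy values, not output from word2vec or any Google system, and `cosine_similarity` is a hypothetical helper name. The example only shows how numeric word vectors let a program compare how closely two words are related.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors, made up for illustration. Real word
# embeddings are learned from text and typically have hundreds of
# dimensions.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

royal_pair = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
mixed_pair = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])
print(f"king vs queen: {royal_pair:.3f}")
print(f"king vs apple: {mixed_pair:.3f}")
```

With these toy values, the "king" and "queen" vectors point in nearly the same direction while "king" and "apple" do not, which is the kind of relationship a trained model is meant to capture from usage.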


Neural Machine Translation with Sequence to Sequence RNN - DATAVERSITY

#artificialintelligence

By Rosaria Silipo. Automatic machine translation has been a popular subject for machine learning algorithms. After all, if machines can detect topics and understand texts, translation should be just the next step. Machine translation can be seen as a variation of natural language generation. In a previous project, we worked on the automatic generation of fairy tales (see "Once upon a Time … by LSTM Network").