 Machine Translation


Advancing AI Capabilities with Next-Generation HPC Solutions

#artificialintelligence

HPE and NVIDIA are delivering IT solutions with superhuman intelligence to harness the full power of AI and pioneer the next generation of HPC systems. In this evolving digital economy, data is the cornerstone of success. Big Data is redefining the way we think, act, and understand the world, and accelerating insight is the difference between making the next major discovery and missing it. The more information we can effectively capture, analyze, and act on, the more opportunities there are to drive technological advancements, ensure economic control, strengthen national security, and fuel scientific research. Organizations across all sectors are putting this data to work.


AI Programming: So Much Uncertainty - The New Stack

#artificialintelligence

Much work, and many tools, are still needed to integrate artificial intelligence into the software engineering workflow, noted Peter Norvig, Google's director of research, speaking at the O'Reilly Artificial Intelligence conference in New York last week. Fundamentally, AI software is inherently different from other forms of widely used software, said Norvig, who is also a co-author of perhaps the most popular book of programming instruction for the field, Artificial Intelligence: A Modern Approach. "One way of looking at the traditional model of programming is to look at the programmer as a micro-manager, who tells a computer exactly how to do something step by step," he said. With AI, we should look at the programmer more as a teacher than as a micro-manager. This will require big changes in how programming is done and in the tools needed to program easily.


Dual Supervised Learning

arXiv.org Machine Learning

Many supervised learning tasks emerge in dual forms, e.g., English-to-French translation vs. French-to-English translation, speech recognition vs. text to speech, and image classification vs. image generation. Two dual tasks have intrinsic connections with each other due to the probabilistic correlation between their models. This connection is, however, not effectively utilized today, since people usually train the models of two dual tasks separately and independently. In this work, we propose training the models of two dual tasks simultaneously, and explicitly exploiting the probabilistic correlation between them to regularize the training process. For ease of reference, we call the proposed approach dual supervised learning. We demonstrate that dual supervised learning can improve the practical performance of both tasks, for various applications including machine translation, image processing, and sentiment analysis.
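
The abstract does not spell out the regularizer. One plausible reading of the "probabilistic correlation" is the joint-probability identity P(x)P(y|x) = P(y)P(x|y), which both dual models should satisfy on the same pair (x, y). The sketch below penalizes the squared violation of that identity in log space. The function names, the weight `lam`, and the assumption that marginal log-probabilities are available (for example from separately trained language models) are illustrative, not taken from the abstract.

```python
import math

def duality_gap(log_p_x, log_p_y_given_x, log_p_y, log_p_x_given_y):
    """Squared violation of log P(x) + log P(y|x) = log P(y) + log P(x|y)."""
    joint_via_primal = log_p_x + log_p_y_given_x  # joint from the x -> y model
    joint_via_dual = log_p_y + log_p_x_given_y    # joint from the y -> x model
    return (joint_via_primal - joint_via_dual) ** 2

def dual_losses(log_p_x, log_p_y_given_x, log_p_y, log_p_x_given_y, lam=0.01):
    """Negative log-likelihood of each task plus a shared duality penalty.

    `lam` is a hypothetical regularization weight.
    """
    gap = duality_gap(log_p_x, log_p_y_given_x, log_p_y, log_p_x_given_y)
    primal_loss = -log_p_y_given_x + lam * gap
    dual_loss = -log_p_x_given_y + lam * gap
    return primal_loss, dual_loss

# Toy numbers: both factorizations give a joint of 0.2, so the gap vanishes.
consistent = duality_gap(math.log(0.5), math.log(0.4),
                         math.log(0.25), math.log(0.8))
# Here the dual model assigns P(x|y) = 0.6, so the two joints disagree.
inconsistent = duality_gap(math.log(0.5), math.log(0.4),
                           math.log(0.25), math.log(0.6))
```

Because the penalty is shared, a gradient step on either loss pushes the two models toward agreeing on the joint distribution, which is the sense in which each task regularizes the other in this sketch.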


Neural Sequence Model Training via $\alpha$-divergence Minimization

arXiv.org Machine Learning

We propose a new neural sequence model training method in which the objective function is defined by $\alpha$-divergence. We demonstrate that the objective function generalizes the maximum-likelihood (ML)-based and reinforcement learning (RL)-based objective functions as special cases (i.e., ML corresponds to $\alpha \to 0$ and RL to $\alpha \to 1$). We also show that the gradient of the objective function can be considered a mixture of ML- and RL-based objective gradients. The experimental results of a machine translation task show that minimizing the objective function with $\alpha > 0$ outperforms $\alpha \to 0$, which corresponds to ML-based methods.
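
The abstract does not give the exact parameterization or say which distributions are compared. As a rough illustration of how a single divergence can interpolate between two objectives, the sketch below uses one standard form of the $\alpha$-divergence for discrete distributions, D_alpha(p, q) = (1 - sum_i p_i^alpha q_i^(1 - alpha)) / (alpha (1 - alpha)), and checks numerically that its two limits recover the two directions of the KL divergence. The distributions `p` and `q` are made-up toy values.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def alpha_divergence(p, q, alpha):
    """Alpha-divergence between discrete distributions, for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    overlap = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - overlap) / (alpha * (1.0 - alpha))

p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

near_zero = alpha_divergence(p, q, 1e-4)       # approaches KL(q || p)
near_one = alpha_divergence(p, q, 1.0 - 1e-4)  # approaches KL(p || q)
midpoint = alpha_divergence(p, q, 0.5)         # symmetric in p and q
```

In this parameterization, intermediate values of alpha blend the two KL directions, which is consistent with the abstract's description of the gradient as a mixture of ML- and RL-based gradients.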


Report: AWS set to add machine-translation services for developers

#artificialintelligence

It sounds like Amazon Web Services is getting ready to bring translation technology used on the Amazon.com CNBC reported Monday that AWS will likely announce the availability of a machine translation service before the big re:Invent user conference this November. AWS has been first among cloud rivals many times in the past when it comes to releasing new services for its customers, one of the many reasons why it enjoys a leading portion of the market for cloud services. But it's playing catch-up here: Google has been working on computer-assisted translation for almost a decade as part of its search technology, and offers a translation API through Google Cloud Platform for its customers. Microsoft also has an API for Azure customers. But Amazon does have a lot of experience in developing natural language processing technology, as AWS VP Swami Sivasubramanian explained earlier this month at our Cloud Tech Summit.


Why AI-powered translation needs a lot of work

#artificialintelligence

The latest scare story around the rise of robots is that within 120 years all human jobs will be automated. If that study from Oxford University is to be believed, we're just 3 to 4 generations away from perpetual holiday. The report goes on to predict when AI will outperform humans and -- more interestingly -- how. Some aspects will be of genuine concern to certain industries: AI will drive heavy goods vehicles better than human drivers by 2027, AI will write better novels than we can by 2049, and, closest to today, AI will be better at translation by 2024. AI has the potential to significantly reshape the translation sector, as it's doing to many other industries already. However, given that the last time human translators were pitted against machine translation (in February), 90 percent of the automated translation was judged "grammatically awkward," that is a bold prediction.


Artificial Intelligence Poised to Ride a New Wave

Communications of the ACM

Chinese professional Go player Ke Jie preparing to make a move during the second game of a match against Google's AlphaGo in May 2017. Artificial intelligence (AI), once described as a technology with permanent potential, has come of age in the past decade. Propelled by massively parallel computer systems, huge datasets, and better algorithms, AI has brought a number of important applications, such as image- and speech-recognition and autonomous vehicle navigation, to near-human levels of performance. Now, AI experts say, a wave of even newer technology may enable systems to understand and react to the world in ways that traditionally have been seen as the sole province of human beings. These technologies include algorithms that model human intuition and make predictions in the face of incomplete knowledge, systems that learn without being pre-trained with labeled data, systems that transfer knowledge gained in one domain to another, hybrid systems that combine two or more approaches, and more powerful and energy-efficient hardware specialized for AI.


Adversarial Neural Machine Translation

arXiv.org Machine Learning

In this paper, we study a new learning paradigm for Neural Machine Translation (NMT). Instead of maximizing the likelihood of the human translation as in previous works, we minimize the distinction between human translation and the translation given by an NMT model. To achieve this goal, inspired by the recent success of generative adversarial networks (GANs), we employ an adversarial training architecture and name it Adversarial-NMT. In Adversarial-NMT, the training of the NMT model is assisted by an adversary, which is an elaborately designed Convolutional Neural Network (CNN). The goal of the adversary is to differentiate the translation generated by the NMT model from that produced by a human. The goal of the NMT model is to produce high-quality translations so as to fool the adversary. A policy gradient method is leveraged to co-train the NMT model and the adversary. Experimental results on English$\rightarrow$French and German$\rightarrow$English translation tasks show that Adversarial-NMT can achieve significantly better translation quality than several strong baselines.
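
To make the policy-gradient idea concrete, here is a deliberately tiny sketch, not the paper's architecture. The "translator" is a softmax policy over three fixed candidate translations, and the CNN adversary is replaced by hypothetical fixed scores giving the probability that each candidate is judged human. In the paper the adversary is co-trained; here it is frozen, and the expected reward is differentiated exactly instead of being estimated from samples, which is feasible only because the toy has three candidates.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_gradient_step(logits, rewards, lr=1.0):
    """One gradient-ascent step on the expected reward of a softmax policy.

    For a softmax policy, d E[r] / d logit_j = p_j * (r_j - E[r]).
    """
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return [v + lr * p * (r - expected_reward)
            for v, p, r in zip(logits, probs, rewards)]

# Hypothetical adversary outputs: P(candidate looks human-written).
adversary_scores = [0.1, 0.9, 0.4]

logits = [0.0, 0.0, 0.0]  # start from a uniform policy
for _ in range(500):
    logits = policy_gradient_step(logits, adversary_scores)

probs = softmax(logits)  # mass concentrates on the candidate the adversary rates highest
```

With a real sequence model the expectation cannot be enumerated, so the gradient would be estimated from sampled translations, and the adversary would be updated in alternation to keep distinguishing model output from human output.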


PostDoc Position in the area of Neural Machine Translation

#artificialintelligence

The Institute of Formal and Applied Linguistics (UFAL) is seeking a candidate for a one-year post-doc position in the area of neural machine translation (NMT). The exact topic will be determined based on the candidate's interests. A PhD degree in computational linguistics, artificial intelligence or a related field is required. Experience with neural MT, Linux and cluster environment (SGE), and/or general deep learning and GPU computation is a bonus.


One Model To Learn Them All

arXiv.org Machine Learning

Deep learning yields great results across many fields, from speech recognition and image classification to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition corpus, and an English parsing task. Our model architecture incorporates building blocks from multiple domains. It contains convolutional layers, an attention mechanism, and sparsely-gated layers. Each of these computational blocks is crucial for a subset of the tasks we train on. Interestingly, even if a block is not crucial for a task, we observe that adding it never hurts performance and in most cases improves it on all tasks. We also show that tasks with less data benefit greatly from joint training with other tasks, while performance on large tasks degrades only slightly if at all.
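
The abstract names sparsely-gated layers without describing them. A common formulation, assumed here and not confirmed by the abstract, routes each input to only the top-k scoring "experts" and mixes their outputs with softmax weights computed over those k scores. The sketch below shows that routing on scalars; the experts, the gate scores, and the function names are all hypothetical.

```python
import math

def top_k_gate(gate_logits, k):
    """Softmax over the k largest gate scores; every other weight is zero."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    chosen = order[:k]
    m = max(gate_logits[i] for i in chosen)
    exps = {i: math.exp(gate_logits[i] - m) for i in chosen}
    total = sum(exps.values())
    weights = [0.0] * len(gate_logits)
    for i in chosen:
        weights[i] = exps[i] / total
    return weights

def sparse_mixture(x, experts, gate_logits, k=2):
    """Weighted sum of the selected experts; unselected experts never run."""
    weights = top_k_gate(gate_logits, k)
    return sum(w * experts[i](x) for i, w in enumerate(weights) if w > 0.0)

# Four toy experts; in a real layer each would be a small neural network.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
gate_logits = [2.0, 0.5, 1.0, -1.0]  # hypothetical gate scores for one input

weights = top_k_gate(gate_logits, k=2)
output = sparse_mixture(3.0, experts, gate_logits, k=2)
```

Because only k experts run per input, capacity can grow with the number of experts while per-example compute stays roughly constant, which is the usual motivation for this kind of layer.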