Machine Translation
Google Translate Vs. Papago: In Asia's Battle Of Translation Apps, Everyone's A Loser
Opinions expressed by Forbes Contributors are their own. If you have ever traveled through a foreign country armed with little more than Google Translate to communicate, you know how awkward it can be to use the app to ask a friendly local for directions to the zoo only to unwittingly end up insulting his sister.
What is artificial intelligence? A three part definition · Simply Statistics
Editor's note: This is the first chapter of a book I'm working on called Demystifying Artificial Intelligence. The goal of the book is to demystify what modern AI is and does for a general audience. So something to smooth the transition between AI fiction and highly mathematical descriptions of deep learning. I'm developing the book over time - so if you buy the book on Leanpub know that there is only one chapter in there so far, but I'll be adding more over the next few weeks and you get free updates. The cover of the book was inspired by this amazing tweet by Twitter user @notajf. Feedback is welcome and encouraged!
Google's AI translation tool seems to have invented its own language
Back in September 2016, Google launched its Neural Machine Translation (GNMT) system, which uses deep learning to deliver more natural translations between languages. Google Translate originally supported only a handful of languages when it launched 10 years ago; today that number has risen to 103. Creating a computer system to translate multiple languages is complex. The people at Google who built it wanted to find out just how clever their system was. So they came up with a challenge.
The rise of AI translators - Raconteur
In The Hitchhiker's Guide to the Galaxy, writer Douglas Adams describes a "small, yellow, leech-like creature" called the Babel fish which "feeds on brain-wave energy, absorbing all unconscious frequencies and then excreting telepathically a matrix formed from the conscious frequencies and nerve signals picked up from the speech centres of the brain, the practical upshot of which is that if you stick one in your ear, you can instantly understand anything said to you in any form of language". Botanists have not discovered anything like the Babel fish, but the science fiction of universal translation is rapidly becoming reality thanks to technological advances. Most exciting for Hitchhiker fans is the Pilot earbud, backed by $3.5 million in crowdfunding raised by a startup called Waverly Labs. The company's chief executive Andrew Ochoa says: "We were really inspired with wearable technology and began working on the idea of a smart earpiece that could solve a global challenge. We were a small team back then, but we all came from different backgrounds and spoke different languages, and that's how we came up with the idea."
Legal artificial intelligence: Can it stand up in a court of law?
In his book Outliers, Malcolm Gladwell repeatedly mentions what has become known as the "10,000-hour rule", which states that to become world-class in any field you must devote 10,000 hours of "deliberate practice". Whether or not you believe the 10,000-hour figure, many would acknowledge that to become an accomplished legal professional requires considerable legal, communicative and, particularly in in-house environments, interpersonal skills that are often acquired after a tremendous amount of effort exerted over many years. There has been much hoopla about AI-based legal systems that, some might have you believe, may soon replace lawyers (no doubt causing a degree of anxiety among some legal professionals). There is some misunderstanding among many lawyers, and much of the public, about what AI systems are presently capable of. Can a legal AI, based on current technology, actually "think" like a lawyer?
Local minima in training of neural networks
Swirszcz, Grzegorz, Czarnecki, Wojciech Marian, Pascanu, Razvan
There has been a lot of recent interest in trying to characterize the error surface of deep models. This stems from a long-standing question. Given that deep networks are highly nonlinear systems optimized by local gradient methods, why do they not seem to be affected by bad local minima? It is widely believed that training of deep models using gradient methods works so well because the error surface either has no local minima, or if they exist they need to be close in value to the global minimum. It is known that such results hold under very strong assumptions which are not satisfied by real models. In this paper we present examples showing that for such a theorem to be true additional assumptions on the data, initialization schemes and/or the model classes have to be made. We look at the particular case of finite size datasets. We demonstrate that in this scenario one can construct counter-examples (datasets or initialization schemes) when the network does become susceptible to bad local minima over the weight space.
OpenNMT
Major source contributions and support come from SYSTRAN. Basically it is "A Modularized Translation Program using Seq2Seq Attention Model". Features of OpenNMT: a simple general-purpose interface that requires only source/target files; speed and memory optimizations for high-performance multi-GPU training; a dependency-free C translator for model deployment; and the latest research features to improve translation performance.
A Dependency-Based Neural Reordering Model for Statistical Machine Translation
Hadiwinoto, Christian (National University of Singapore) | Ng, Hwee Tou (National University of Singapore)
In machine translation (MT) that involves translating between two languages with significant differences in word order, determining the correct word order of translated words is a major challenge. The dependency parse tree of a source sentence can help to determine the correct word order of the translated words. In this paper, we present a novel reordering approach utilizing a neural network and dependency-based embeddings to predict whether the translations of two source words linked by a dependency relation should remain in the same order or should be swapped in the translated sentence. Experiments on Chinese-to-English translation show that our approach yields a statistically significant improvement of 0.57 BLEU point on benchmark NIST test sets, compared to our prior state-of-the-art statistical MT system that uses sparse dependency-based reordering features.
Neural Machine Translation Advised by Statistical Machine Translation
Wang, Xing (Soochow University) | Lu, Zhengdong (Noah's Ark Lab, Huawei Technologies) | Tu, Zhaopeng (Noah's Ark Lab, Huawei Technologies) | Li, Hang (Noah's Ark Lab, Huawei Technologies) | Xiong, Deyi (Soochow University) | Zhang, Min (Soochow University)
Neural Machine Translation (NMT) is a new approach to machine translation that has made great progress in recent years. However, recent studies show that NMT generally produces fluent but inadequate translations (Tu et al. 2016b; 2016a; He et al. 2016; Tu et al. 2017). This is in contrast to conventional Statistical Machine Translation (SMT), which usually yields adequate but non-fluent translations. It is natural, therefore, to leverage the advantages of both models for better translations, and in this work we propose to incorporate an SMT model into the NMT framework. More specifically, at each decoding step, SMT offers additional recommendations of generated words based on the decoding information from NMT (e.g., the generated partial translation and attention history). Then we employ an auxiliary classifier to score the SMT recommendations and a gating function to combine the SMT recommendations with NMT generations, both of which are jointly trained within the NMT architecture in an end-to-end manner. Experimental results on Chinese-English translation show that the proposed approach achieves significant and consistent improvements over state-of-the-art NMT and SMT systems on multiple NIST test sets.
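The gating idea in this abstract can be pictured with a minimal sketch. It assumes the gate is a single scalar in [0, 1] and that the two word distributions are mixed by linear interpolation; the abstract does not give the exact form, and in the paper the gate is learned jointly with the network and recomputed at every decoding step. The function name combine_distributions and the toy probabilities are hypothetical.

```python
def combine_distributions(p_nmt, p_smt, gate):
    """Mix two word distributions with a scalar gate (assumed form: linear interpolation)."""
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    vocab = set(p_nmt) | set(p_smt)
    # Words missing from one distribution are treated as having probability 0 there.
    return {
        w: (1.0 - gate) * p_nmt.get(w, 0.0) + gate * p_smt.get(w, 0.0)
        for w in vocab
    }


# Hypothetical toy distributions for one decoding step.
p_nmt = {"cat": 0.6, "dog": 0.3, "pet": 0.1}   # NMT generation probabilities
p_smt = {"cat": 0.2, "pet": 0.8}               # scored SMT recommendations
combined = combine_distributions(p_nmt, p_smt, gate=0.25)
best = max(combined, key=combined.get)
print(best, combined)
```

Because both inputs sum to one and the gate lies in [0, 1], the mixture is itself a valid probability distribution, which is the main reason to prefer interpolation in a sketch like this.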
BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings
Zhang, Biao (Xiamen University) | Xiong, Deyi (Soochow University) | Su, Jinsong (Xiamen University)
In this paper, we propose a bidimensional attention based recursive autoencoder (BattRAE) to integrate clues and source-target interactions at multiple levels of granularity into bilingual phrase representations. We employ recursive autoencoders to generate tree structures of phrases with embeddings at different levels of granularity (e.g., words, sub-phrases and phrases). Over these embeddings on the source and target side, we introduce a bidimensional attention network to learn their interactions encoded in a bidimensional attention matrix, from which we extract two soft attention weight distributions simultaneously. These weight distributions enable BattRAE to generate compositive phrase representations via convolution. Based on the learned phrase representations, we further use a bilinear neural model, trained via a max-margin method, to measure bilingual semantic similarity. To evaluate the effectiveness of BattRAE, we incorporate this semantic similarity as an additional feature into a state-of-the-art SMT system. Extensive experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.63 BLEU points on average over the baseline.
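As a rough illustration of how one attention matrix can yield two weight distributions, here is a small sketch. It makes several assumptions that the abstract does not state: the matrix entries are plain dot products between source and target embeddings (the paper uses a learned network), the two distributions come from a softmax over row sums and column sums, and the phrase representation is a weighted sum of embeddings. All function names and toy vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bidimensional_attention(src, tgt):
    """Build a source-by-target score matrix and read off two attention distributions."""
    # Assumption: dot-product scores stand in for the learned interaction network.
    matrix = [[dot(s, t) for t in tgt] for s in src]
    src_scores = [sum(row) for row in matrix]        # one score per source embedding
    tgt_scores = [sum(col) for col in zip(*matrix)]  # one score per target embedding
    return softmax(src_scores), softmax(tgt_scores)


def weighted_sum(weights, vectors):
    """Compose a single phrase vector from attention weights and embeddings."""
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


# Hypothetical toy embeddings (words, sub-phrases, phrases) on each side.
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[1.0, 0.0], [0.5, 0.5]]
src_w, tgt_w = bidimensional_attention(src, tgt)
src_phrase = weighted_sum(src_w, src)
tgt_phrase = weighted_sum(tgt_w, tgt)
print(src_w, tgt_w)
```

The point of the sketch is only that both distributions are read off the same matrix, one along each dimension, so each side's weights depend on the embeddings of the other side.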