Machine Translation
Estimating Text Intelligibility via Information Packaging Analysis
Li, Junyi Jessy (University of Pennsylvania)
Effective communication through language involves organizing the content a person or system wishes to convey into text that flows naturally. There are many ways to render the same information, but those appropriate for one audience may not be intelligible to another. The goal of this thesis is to analyze and address factors that influence the intelligibility of text from two aspects of information packaging: discourse structure and text specificity.
To Swap or Not to Swap? Exploiting Dependency Word Pairs for Reordering in Statistical Machine Translation
Hadiwinoto, Christian (National University of Singapore) | Liu, Yang (Tsinghua University) | Ng, Hwee Tou (National University of Singapore)
Reordering poses a major challenge in machine translation (MT) between two languages with significant differences in word order. In this paper, we present a novel reordering approach utilizing sparse features based on dependency word pairs. Each instance of these features captures whether two words, which are related by a dependency link in the source sentence dependency parse tree, follow the same order or are swapped in the translation output. Experiments on Chinese-to-English translation show a statistically significant improvement of 1.21 BLEU points using our approach, compared to a state-of-the-art statistical MT system that incorporates prior reordering approaches.
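The swap-or-not feature described in this abstract can be illustrated with a minimal sketch. It assumes a finished word alignment is available, whereas the paper applies such features during decoding. The function name `swap_features`, the feature-name format, and the toy parse and alignment are all hypothetical, not taken from the paper.

```python
def swap_features(src_words, dep_links, src_to_tgt):
    """Count one sparse feature per aligned dependency pair: same order or swapped.

    dep_links holds (head_index, dependent_index) pairs from the source parse;
    src_to_tgt maps a source word index to its position in the translation.
    """
    features = {}
    for head, dep in dep_links:
        if head not in src_to_tgt or dep not in src_to_tgt:
            continue  # unaligned word: no order to compare
        if src_to_tgt[head] == src_to_tgt[dep]:
            continue  # both words map to one target position
        same_order = (head < dep) == (src_to_tgt[head] < src_to_tgt[dep])
        label = "same" if same_order else "swapped"
        name = f"{src_words[head]}|{src_words[dep]}|{label}"
        features[name] = features.get(name, 0) + 1
    return features


# Toy example (pinyin for "he works in Beijing"); parse and alignment are made up.
src_words = ["ta", "zai", "Beijing", "gongzuo"]
dep_links = [(3, 0), (3, 1), (1, 2)]   # gongzuo->ta, gongzuo->zai, zai->Beijing
src_to_tgt = {0: 0, 1: 2, 2: 3, 3: 1}  # he(0) works(1) in(2) Beijing(3)
features = swap_features(src_words, dep_links, src_to_tgt)
```

In this toy case the verb and its prepositional modifier change order in the English output, so only that pair receives a "swapped" feature. In a real system each feature would carry a learned weight.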
Building Earth Mover's Distance on Bilingual Word Embeddings for Machine Translation
Zhang, Meng (Tsinghua University) | Liu, Yang (Tsinghua University) | Luan, Huanbo (Tsinghua University) | Sun, Maosong (Tsinghua University) | Izuha, Tatsuya (Toshiba Corporation Corporate Research & Development Center) | Hao, Jie
Following their monolingual counterparts, bilingual word embeddings are also on the rise. As a major application task, word translation has been relying on the nearest neighbor to connect embeddings cross-lingually. However, the nearest neighbor strategy suffers from its inherently local nature and fails to cope with variations in realistic bilingual word embeddings. Furthermore, it lacks a mechanism to deal with many-to-many mappings that often show up across languages. We introduce Earth Mover's Distance to this task by providing a natural formulation that translates words in a holistic fashion, addressing the limitations of the nearest neighbor. We further extend the formulation to a new task of identifying parallel sentences, which is useful for statistical machine translation systems, thereby expanding the application realm of bilingual word embeddings. We show encouraging performance on both tasks.
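The contrast between nearest-neighbor lookup and a holistic, transport-based translation can be sketched on a toy example. This is a simplified special case, not the paper's formulation: it assumes equally many source and target words with uniform weights, in which case the Earth Mover's Distance equals the cost of the best one-to-one matching divided by the number of words. The brute-force search, the function names, and the two-dimensional "embeddings" are illustrative assumptions.

```python
import itertools
import math


def nearest_neighbor(src, tgt):
    """Translate each source word independently to its closest target word."""
    return {s: min(tgt, key=lambda t: math.dist(src[s], tgt[t])) for s in src}


def uniform_emd(src, tgt):
    """EMD for equal-sized sets with uniform weights.

    Under these assumptions the optimal transport plan is a one-to-one
    matching, so a brute-force search over permutations is enough for a toy.
    """
    s_words, t_words = list(src), list(tgt)
    if len(s_words) != len(t_words):
        raise ValueError("this sketch needs equally many source and target words")
    best_cost, best_match = math.inf, None
    for perm in itertools.permutations(t_words):
        cost = sum(math.dist(src[s], tgt[t]) for s, t in zip(s_words, perm))
        if cost < best_cost:
            best_cost, best_match = cost, dict(zip(s_words, perm))
    return best_cost / len(s_words), best_match


# Made-up 2-D vectors in which "cat" sits close to both source words.
src = {"gato": (0.0, 0.0), "perro": (1.0, 0.0)}
tgt = {"cat": (0.4, 0.0), "dog": (2.0, 0.0)}

nn = nearest_neighbor(src, tgt)        # both source words pick "cat"
emd, matching = uniform_emd(src, tgt)  # matching pairs perro with "dog"
```

Because every target word must receive its share of mass, the matching cannot send both source words to the same target, which is the local failure the nearest neighbor exhibits here. Handling non-uniform weights or many-to-many mappings would require a general transport solver.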
Syntactic Skeleton-Based Translation
Xiao, Tong (Northeastern University) | Zhu, Jingbo (Northeastern University) | Zhang, Chunliang (Northeastern University) | Liu, Tongran (Institute of Psychology (CAS))
In this paper, we propose an approach to modeling the syntactically motivated skeletal structure of a source sentence for machine translation. This model allows for the application of high-level syntactic transfer rules and low-level non-syntactic rules. It thus involves fully syntactic, non-syntactic, and partially syntactic derivations via a single grammar and decoding paradigm. On large-scale Chinese-English and English-Chinese translation tasks, we obtain an average improvement of +0.9 BLEU across the newswire and web genres.
Improved Neural Machine Translation with SMT Features
He, Wei (Baidu Inc.) | He, Zhongjun (Baidu Inc.) | Wu, Hua (Baidu Inc.) | Wang, Haifeng (Baidu Inc.)
Neural machine translation (NMT) conducts end-to-end translation with a source language encoder and a target language decoder, achieving promising translation performance. However, as a newly emerged approach, the method has some limitations. An NMT system usually has to restrict its vocabulary to a certain size to avoid time-consuming training and decoding, which causes a serious out-of-vocabulary problem. Furthermore, the decoder lacks a mechanism to guarantee that all source words are translated and usually favors short translations, resulting in fluent but inadequate translations. To solve these problems, we incorporate statistical machine translation (SMT) features, such as a translation model and an n-gram language model, with the NMT model under the log-linear framework. Our experiments show that the proposed method significantly improves the translation quality of the state-of-the-art NMT system on Chinese-to-English translation tasks. Our method produces a gain of up to 2.33 BLEU points on NIST open test sets.
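The log-linear combination mentioned in this abstract scores a translation as a weighted sum of feature values, typically log-probabilities plus a length term. The sketch below shows that scoring rule applied to two finished candidates. This simplifies the paper's setting, where the features act during decoding. The feature names, weights, and candidate scores are invented for illustration.

```python
def log_linear_score(features, weights):
    """Weighted sum of feature values: score = sum_i w_i * h_i."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical feature values for two candidate translations of one sentence.
# "nmt", "tm", and "lm" are log-probabilities; "length" is the word count.
candidates = {
    "short": {"nmt": -2.0, "tm": -6.0, "lm": -3.0, "length": 3},
    "full": {"nmt": -2.5, "tm": -3.0, "lm": -3.5, "length": 6},
}

# Hypothetical weights; in practice they would be tuned on held-out data.
weights = {"nmt": 1.0, "tm": 0.5, "lm": 0.5, "length": 0.1}

best_by_nmt = max(candidates, key=lambda c: candidates[c]["nmt"])
best_by_log_linear = max(
    candidates, key=lambda c: log_linear_score(candidates[c], weights)
)
```

With these made-up numbers the NMT score alone prefers the shorter candidate, while the combined score prefers the longer one. The translation model and length features penalize dropping source words, which mirrors the adequacy problem the abstract describes.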
Can machines 'learn' or 'think'? - raconteur.net
The marriage of computing power and data is finally bearing fruit in the field of cognitive computing, sometimes called machine learning or, more controversially, artificial intelligence. In its most everyday form, we see it in tools such as Google Translate or Microsoft's Bing Translate, which can translate phrases and documents effortlessly across multiple languages. More futuristically, the promise of self-driving vehicles, which can complete entire road journeys without driver intervention, is already being realised. Yet the biggest revolution in work is happening at some of the most basic levels, such as reading and dissecting legal documents to extract meaning and useful information. The tedious slog of work can be transformed by computers which are able to read and parse legal phrases, and summarise them or enter relevant details into a database or spreadsheet.
Microsoft beats Google to offline translation on iOS
When Microsoft launched the offline functionality for Android, it was really bringing the experience in line with Google's offering on the platform. But while the search giant's Translate app for Android does offline translation of text (and even photos containing text), its iOS app is online-only. That makes Microsoft's Translate app the first from a major company to offer the functionality, and the first ever on the platform to use a neural network to achieve it. The iOS app supports 43 languages, although you'll have to download the relevant libraries before going offline. That's a lot more than the nine the Android version launched with, but Microsoft says it's updating that app to support the expanded catalog.
Unlike Google Translate, Microsoft Translator for iOS now works offline
Microsoft today announced that its Microsoft Translator app for iOS devices can now translate text and images from one language to another even when you're offline. The app already supported this functionality on Android, and the competing Google Translate for Android could work offline, too. But in this case, Microsoft has beaten Google to the punch -- Google Translate currently works offline only on Android. "Until now, iPhone users needed an Internet connection if they wanted to translate on their mobile devices. Now, by downloading the Microsoft Translator app and the needed offline language packs, iOS users can get near online-quality translations even when they are not connected to the Internet. This means no expensive roaming charges or not being able to communicate when a data connection is spotty or unavailable," the Microsoft Translator team wrote in a blog post.
Why machine learning will impact, but not take, your job - Information Age
Artificial intelligence is being used all around us, but it looks nothing like The Jetsons. So why are people panicked that robots will take their jobs? The World Economic Forum warned that robots and technological advances will take more than 5 million jobs from humans over the next five years. Machine learning has undoubtedly earned its place in the workforce, but machines don't necessarily have to replace humans; in fact, they can enhance the work humans do. One area where machine learning is flourishing is the localisation and translation industry.