"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).
After half a century of hype and false starts, artificial intelligence may finally be starting to transform the U.S. economy. An example is machine translation, as we found when analyzing eBay's deployment in 2014 of an AI-based tool that learned to translate by digesting millions of lines of eBay data and data from the Web. The aim is to allow eBay sellers and buyers in different countries to more easily connect with one another. The tool detects the location of an eBay user's Internet Protocol address in, say, a Spanish-speaking country and automatically translates the English title of the eBay offering. After eBay unveiled its English-Spanish translator for search queries and item titles, exports on eBay from the United States to Latin America increased by more than 17 percent.
Erik Brynjolfsson is the director of the MIT Initiative on the Digital Economy and co-author, with Andrew McAfee, of "Machine/Platform/Crowd." Xiang Hui is an assistant professor of marketing at Washington University, where Meng Liu is a visiting assistant professor of marketing; both are research fellows at the MIT initiative.
Nowadays, you see a lot of user reviews of products and services on sites such as TripAdvisor, Yelp and Amazon. Most people read these peer reviews and trust what they see without knowing that not all of them are legitimate. Some of the reviews are fake. In fact, up to 40 per cent of users decide to make a purchase based on only a couple of reviews, and great reviews make people spend 30 per cent more on their purchases. To combat fake reviews, scientists have developed an artificial intelligence (AI) system that can identify machine-generated fake reviews on e-commerce websites.
While current state-of-the-art NMT models, such as RNN seq2seq and Transformers, possess a large number of parameters, they are still shallow in comparison to convolutional models used for both text and vision applications. In this work we attempt to train significantly (2-3x) deeper Transformer and Bi-RNN encoders for machine translation. We propose a simple modification to the attention mechanism that eases the optimization of deeper models, and results in consistent gains of 0.7-1.1 BLEU on the benchmark WMT'14 English-German and WMT'15 Czech-English tasks for both architectures.
While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest. In this paper, we propose an alternative approach based on phrase-based Statistical Machine Translation (SMT) that significantly closes the gap with supervised systems. Our method profits from the modular architecture of SMT: we first induce a phrase table from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. In addition, iterative backtranslation improves results further, yielding, for instance, 14.08 and 26.22 BLEU points in WMT 2014 English-German and English-French, respectively, an improvement of more than 7-10 BLEU points over previous unsupervised systems, and closing the gap with supervised SMT (Moses trained on Europarl) down to 2-5 BLEU points. Our implementation is available at https://github.com/artetxem/monoses
Sequence-to-Sequence models were introduced to tackle many real-life problems like machine translation, summarization, image captioning, etc. The standard optimization algorithms are mainly based on example-to-example matching, such as maximum likelihood estimation, which is known to suffer from the data sparsity problem. Here we present an alternate view that explains sequence-to-sequence learning as a distribution matching problem, where each source or target example is viewed as representing a local latent distribution in the source or target domain. We then interpret sequence-to-sequence learning as learning a transductive model that transforms the source local latent distributions to match their corresponding target distributions. In our framework, we approximate both the source and target latent distributions with recurrent neural networks (augmenters). During training, the parallel augmenters learn to better approximate the local latent distributions, while the sequence prediction model learns to minimize the KL-divergence between the transformed source distributions and the approximated target distributions. This algorithm can alleviate the data sparsity issues in sequence learning by locally augmenting more unseen data pairs and increasing the model's robustness. Experiments conducted on machine translation and image captioning consistently demonstrate the superiority of our proposed algorithm over the other competing algorithms.
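To make the KL-divergence objective mentioned in this abstract concrete, the following is a minimal sketch for two discrete distributions. It is only an illustration: the abstract does not say which direction of the divergence is minimized or how the latent distributions are parameterized, so the function name `kl_divergence`, the argument order, and the toy probability vectors are all assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Assumes p and q have the same length, each sums to 1, and q is
    nonzero wherever p is nonzero. Terms with p_i == 0 contribute 0.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy distributions standing in for a "transformed source"
# distribution (p) and an "approximated target" distribution (q).
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

same = kl_divergence(p, p)      # zero when the distributions match
forward = kl_divergence(p, q)   # positive when they differ
reverse = kl_divergence(q, p)   # generally differs from forward
```

The divergence is zero only when the two distributions coincide and is not symmetric, which is why the choice of direction matters in a distribution-matching objective.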
Humans have been pondering the potential of artificial intelligence for thousands of years. Ancient Greeks believed, for example, that a bronze automaton named Talos protected the island of Crete from maritime adversaries. But AI only moved from the mythical realm to the real world in the last half-century, beginning with legendary computer scientist Alan Turing's foundational 1950 essay, which asked and provided a framework for answering the provocative question, "Can machines think?" At that time, the United States was in the midst of the Cold War. Congressional representatives decided to invest heavily in artificial intelligence as part of a larger security strategy.
Recent studies have shown that reinforcement learning (RL) is an effective approach for improving the performance of neural machine translation (NMT) systems. However, due to its instability, successful RL training is challenging, especially in real-world systems where deep models and large datasets are leveraged. In this paper, taking several large-scale translation tasks as testbeds, we conduct a systematic study on how to train better NMT models using reinforcement learning. We provide a comprehensive comparison of several important factors (e.g., baseline reward, reward shaping) in RL training. Furthermore, since it remains unclear whether RL is still beneficial when monolingual data is used, we propose a new method that leverages RL to further boost the performance of NMT systems trained with source/target monolingual data. By integrating all our findings, we obtain competitive results on the WMT14 English-German, WMT17 English-Chinese, and WMT17 Chinese-English translation tasks, and in particular set a new state of the art on the WMT17 Chinese-English translation task.
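The role of a baseline reward in RL training can be illustrated with a small REINFORCE-style sketch. This is not the paper's implementation: the helper name `reinforce_loss`, the use of the batch mean reward as the baseline, and the toy log-probabilities and sentence-level rewards are all assumptions chosen for illustration.

```python
def reinforce_loss(log_probs, rewards):
    """Policy-gradient surrogate loss with a mean-reward baseline.

    log_probs: log-probability of each sampled translation under the model.
    rewards:   a sentence-level score for each sample (e.g. a BLEU-like value).
    Returns (loss, advantages). Subtracting the baseline leaves the expected
    gradient unchanged but typically reduces its variance.
    """
    if len(log_probs) != len(rewards) or not rewards:
        raise ValueError("need one reward per sampled translation")
    baseline = sum(rewards) / len(rewards)  # assumed baseline: batch mean
    advantages = [r - baseline for r in rewards]
    loss = -sum(a * lp for a, lp in zip(advantages, log_probs)) / len(rewards)
    return loss, advantages


# Hypothetical batch of three sampled translations.
log_probs = [-1.2, -0.8, -2.0]
rewards = [0.30, 0.50, 0.10]
loss, advantages = reinforce_loss(log_probs, rewards)
```

Samples that score above the baseline get a positive advantage, so minimizing the loss raises their probability, while below-baseline samples are pushed down.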
Most Neural Machine Translation (NMT) models are based on the sequence-to-sequence (Seq2Seq) model with an encoder-decoder framework equipped with the attention mechanism. However, the conventional attention mechanism treats the decoding at each time step equally with the same matrix, which is problematic since the softness of the attention for different types of words (e.g. content words and function words) should differ. Therefore, we propose a new model with a mechanism called Self-Adaptive Control of Temperature (SACT) to control the softness of attention by means of an attention temperature. Experimental results on Chinese-English and English-Vietnamese translation demonstrate that our model outperforms the baseline models, and the analysis and the case study show that our model can attend to the most relevant elements in the source-side contexts and generate high-quality translations.
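How a temperature controls the softness of attention can be seen in a short sketch of a temperature-scaled softmax. In SACT the temperature is adapted at each decoding step, but the abstract does not describe how it is computed, so here it is simply a hand-picked parameter; the function name `attention_weights` and the example scores are assumptions.

```python
import math


def attention_weights(scores, temperature=1.0):
    """Softmax over attention scores divided by a temperature.

    A temperature below 1 sharpens the distribution (harder attention);
    a temperature above 1 flattens it (softer attention).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical alignment scores between one decoder state and three source words.
scores = [2.0, 1.0, 0.1]
sharp = attention_weights(scores, temperature=0.5)
soft = attention_weights(scores, temperature=2.0)
```

With the lower temperature most of the weight concentrates on the highest-scoring source word, while the higher temperature spreads weight more evenly across the source context.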
Artificial intelligence (AI) is surpassing human performance in a growing number of domains. However, there is limited evidence of its economic effects. Using data from a digital platform, we study a key application of AI: machine translation. We find that the introduction of a machine translation system has significantly increased international trade on this platform, increasing exports by 17.5%. Furthermore, heterogeneous treatment effects are all consistent with a substantial reduction in translation-related search costs.