Machine Translation
Choosing the Right Metric for Evaluating Machine Learning Models – Part 1
In the world of postmodernism, Relativism has been, in its various guises, both one of the most popular and most reviled philosophical doctrines. According to Relativism, there is no universal and objective truth; rather each point of view has its own truth. You must be wondering why I am discussing it and how it is even related to Data Science. Well, in this post, I will be discussing the usefulness of each error metric depending on the objective and the problem we are trying to solve. When someone tells you that "USA is the best country", the first question that you should ask is on what basis is this statement being made.
English-Catalan Neural Machine Translation in the Biomedical Domain through the cascade approach
Costa-jussà, Marta R., Casas, Noe, Melero, Maite
This paper describes the methodology followed to build a neural machine translation system in the biomedical domain for the English-Catalan language pair. This task can be considered a low-resourced task from the point of view of the domain and the language pair. To face this task, this paper reports experiments on a cascade pivot strategy through Spanish for the neural machine translation using the English-Spanish SCIELO and Spanish-Catalan El Periódico database. To test the final performance of the system, we have created a new test data set for English-Catalan in the biomedical domain which is freely available on request.
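The cascade pivot idea in the abstract above can be illustrated with a minimal sketch: an English-to-Spanish system feeds its output into a Spanish-to-Catalan system. The word-lookup tables and the helper names (`cascade`, `word_lookup`) are hypothetical toy stand-ins, not the paper's neural models or data.

```python
from typing import Callable, Dict

# A translator is any function from source text to target text.
Translator = Callable[[str], str]


def cascade(src_to_pivot: Translator, pivot_to_tgt: Translator) -> Translator:
    """Chain two translators through a pivot language."""
    def translate(text: str) -> str:
        pivot_text = src_to_pivot(text)   # e.g. English -> Spanish
        return pivot_to_tgt(pivot_text)   # e.g. Spanish -> Catalan
    return translate


def word_lookup(table: Dict[str, str]) -> Translator:
    """Toy word-by-word translator (assumption: stands in for a real NMT system)."""
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


# Hypothetical miniature dictionaries, for illustration only.
EN_ES = {"the": "el", "heart": "corazón"}
ES_CA = {"el": "el", "corazón": "cor"}

en_to_ca = cascade(word_lookup(EN_ES), word_lookup(ES_CA))
print(en_to_ca("the heart"))  # el cor
```

In a cascade like this, errors from the first system propagate into the second, which is one reason a dedicated English-Catalan test set is useful for measuring end-to-end quality.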
JUNIPR: a Framework for Unsupervised Machine Learning in Particle Physics
Andreassen, Anders, Feige, Ilya, Frye, Christopher, Schwartz, Matthew D.
In applications of machine learning to particle physics, a persistent challenge is how to go beyond discrimination to learn about the underlying physics. To this end, a powerful tool would be a framework for unsupervised learning, where the machine learns the intricate high-dimensional contours of the data upon which it is trained, without reference to pre-established labels. In order to approach such a complex task, an unsupervised network must be structured intelligently, based on a qualitative understanding of the data. In this paper, we scaffold the neural network's architecture around a leading-order model of the physics underlying the data. In addition to making unsupervised learning tractable, this design actually alleviates existing tensions between performance and interpretability. We call the framework JUNIPR: "Jets from UNsupervised Interpretable PRobabilistic models". In this approach, the set of particle momenta composing a jet are clustered into a binary tree that the neural network examines sequentially. Training is unsupervised and unrestricted: the network could decide that the data bears little correspondence to the chosen tree structure. However, when there is a correspondence, the network's output along the tree has a direct physical interpretation. JUNIPR models can perform discrimination tasks, through the statistically optimal likelihood-ratio test, and they permit visualizations of discrimination power at each branching in a jet's tree. Additionally, JUNIPR models provide a probability distribution from which events can be drawn, providing a data-driven Monte Carlo generator. As a third application, JUNIPR models can reweight events from one (e.g. simulated) data set to agree with distributions from another (e.g. experimental) data set.
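Two of the applications named in the abstract above, likelihood-ratio discrimination and event reweighting, reduce to simple operations on model log-probabilities. The sketch below assumes one-dimensional Gaussian densities as stand-ins for trained probabilistic models; the function names are hypothetical and nothing here reproduces JUNIPR itself.

```python
import math
from typing import Callable

# A model here is any function returning log p(x) for an event x.
LogDensity = Callable[[float], float]


def gaussian_logpdf(mean: float, std: float) -> LogDensity:
    """Toy stand-in for a trained probabilistic model (assumption)."""
    def logpdf(x: float) -> float:
        return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)
    return logpdf


def log_likelihood_ratio(x: float, model_a: LogDensity, model_b: LogDensity) -> float:
    """Discriminant: positive values favor model_a, negative favor model_b."""
    return model_a(x) - model_b(x)


def reweight(x: float, source: LogDensity, target: LogDensity) -> float:
    """Weight that makes events drawn from `source` follow `target`."""
    return math.exp(target(x) - source(x))


model_a = gaussian_logpdf(mean=0.0, std=1.0)
model_b = gaussian_logpdf(mean=2.0, std=1.0)

# An event halfway between the two means is equally likely under both models.
print(log_likelihood_ratio(1.0, model_a, model_b))
print(reweight(1.0, source=model_a, target=model_b))
```

Thresholding the log-likelihood ratio gives a classifier, and multiplying each simulated event by its weight is the reweighting step the abstract describes.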
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation
Chen, Mia Xu, Firat, Orhan, Bapna, Ankur, Johnson, Melvin, Macherey, Wolfgang, Foster, George, Jones, Llion, Parmar, Niki, Schuster, Mike, Chen, Zhifeng, Wu, Yonghui, Hughes, Macduff
The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT). The classic RNN-based approaches to MT were first out-performed by the convolutional seq2seq model, which was then out-performed by the more recent Transformer model. Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training techniques that are in principle applicable to other seq2seq architectures. In this paper, we tease apart the new architectures and their accompanying techniques in two ways. First, we identify several key modeling and training techniques, and apply them to the RNN architecture, yielding a new RNMT+ model that outperforms all of the three fundamental architectures on the benchmark WMT'14 English to French and English to German tasks. Second, we analyze the properties of each fundamental seq2seq architecture and devise new hybrid architectures intended to combine their strengths. Our hybrid models obtain further improvements, outperforming the RNMT+ model on both benchmark datasets.
Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models
Strobelt, Hendrik, Gehrmann, Sebastian, Behrisch, Michael, Perer, Adam, Pfister, Hanspeter, Rush, Alexander M.
Neural Sequence-to-Sequence models have proven to be accurate and robust for many sequence prediction tasks, and have become the standard approach for automatic translation of text. The models work in a five-stage black-box process that involves encoding a source sequence to a vector space and then decoding out to a new target sequence. This process is now standard, but like many deep learning methods remains quite difficult to understand or debug. In this work, we present a visual analysis tool that allows interaction with a trained sequence-to-sequence model through each stage of the translation process. The aim is to identify which patterns have been learned and to detect model errors. We demonstrate the utility of our tool through several real-world large-scale sequence-to-sequence use cases.
The Apps On Your Mobile That Use Machine Learning Algorithms
Seems like the term Machine Learning is popping up in mainstream media as the next big thing. The fact is, however, that Machine Learning went mainstream a long time ago. You don't think so? Check your mobile phone. Chances are you've been using and benefiting from Machine Learning algorithms all this time without even knowing it. In this blog post, I go through some of the many apps on your mobile phone that use Machine Learning algorithms to make recommendations, get you to your destination quickly and safely, improve your photos, tell you what song you're listening to and more.
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Yu, Adams Wei, Dohan, David, Luong, Minh-Thang, Zhao, Rui, Chen, Kai, Norouzi, Mohammad, Le, Quoc V.
Current end-to-end machine reading and question answering (Q&A) models are primarily based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often slow for both training and inference due to the sequential nature of RNNs. We propose a new Q&A architecture called QANet, which does not require recurrent networks: Its encoder consists exclusively of convolution and self-attention, where convolution models local interactions and self-attention models global interactions. On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models. The speed-up gain allows us to train the model with much more data. We hence combine our model with data generated by backtranslation from a neural machine translation model. On the SQuAD dataset, our single model, trained with augmented data, achieves 84.6 F1 score on the test set, which is significantly better than the best published F1 score of 81.8.
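The F1 scores quoted in the abstract above measure token overlap between a predicted answer and a reference answer. The sketch below is a simplified version of that idea: it lowercases and splits on whitespace only, whereas the official SQuAD evaluation also strips punctuation and articles and takes the maximum over several reference answers. The function name `token_f1` is an assumption for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall (simplified)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Two of three predicted tokens match, and both reference tokens are covered:
# precision = 2/3, recall = 1, F1 = 0.8
print(round(token_f1("the cat sat", "the cat"), 3))  # 0.8
```

A dataset-level score such as 84.6 is this per-question value averaged over all questions and expressed as a percentage.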
Microsoft Translator gets offline AI translations
Chances are you mostly need a translator app on your phone while you are traveling. But that's also when you are most likely to not have any connectivity. While most translation apps still work when they are offline, they can't use the sophisticated -- and computationally intense -- machine learning algorithms in the cloud that typically power them. Until now, that was also the case for the Microsoft Translator app on Amazon Fire, Android and iOS, but starting today, the app will actually run a slightly modified neural translation when offline (though iOS users may still have to wait a few days, as the update still has to be approved by Apple). What's interesting about this is that Microsoft is able to do this on virtually any modern phone and that there is no need for a custom AI chip in them.
Microsoft Translator gets offline AI translations support
Microsoft has announced that its Translator app for Android, iOS and Amazon Fire tablets will now support AI translations even when the device is offline. Translator comes in really handy when you are traveling to a foreign country and are not familiar with the local language. But since connectivity is often limited when you are traveling, users are left with basic translations on their mobile applications. With the new update for Microsoft Translator, the Redmond-based software major wants to change that narrative altogether. Microsoft says its Translator app will be able to use sophisticated algorithms and computational power for translation even when the device is not connected to the internet.