Machine Translation
Learn to Talk via Proactive Knowledge Transfer
Knowledge Transfer has been applied in solving a wide variety of problems. For example, knowledge can be transferred between tasks (e.g., learning to handle novel situations by leveraging prior knowledge) or between agents (e.g., learning from others without direct experience). Without loss of generality, we relate knowledge transfer to KL-divergence minimization, i.e., matching the (belief) distributions of learners and teachers. The equivalence gives us a new perspective in understanding variants of the KL-divergence by looking at how learners structure their interaction with teachers in order to acquire knowledge. In this paper, we provide an in-depth analysis of KL-divergence minimization in Forward and Backward orders, which shows that learners are reinforced via on-policy learning in Backward. In contrast, learners are supervised in Forward. Moreover, our analysis is gradient-based, so it can be generalized to arbitrary tasks and help to decide which order to minimize given the property of the task. By replacing Forward with Backward in Knowledge Distillation, we observed +0.7-1.1 BLEU gains on the WMT'17 De-En and IWSLT'15 Th-En machine translation tasks.
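The difference between the two orders can be made concrete with a small numerical sketch. It assumes that "Forward" means KL(teacher || learner), an expectation under the teacher's distribution, and that "Backward" means KL(learner || teacher), an expectation under the learner's own distribution. The function name and the toy distributions are illustrative and are not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical belief distributions over three outcomes.
teacher = [0.7, 0.2, 0.1]
learner = [0.4, 0.4, 0.2]

# Forward order: weighted by the teacher's probabilities (supervised flavour).
forward_kl = kl_divergence(teacher, learner)

# Backward order: weighted by the learner's own probabilities (on-policy flavour).
backward_kl = kl_divergence(learner, teacher)

print(f"forward  KL(teacher || learner) = {forward_kl:.4f}")
print(f"backward KL(learner || teacher) = {backward_kl:.4f}")
```

The two values differ because KL-divergence is not symmetric, which is why the choice of order can matter for a given task.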
Neural Machine Translation without Embeddings
Many NLP models follow the embed-contextualize-predict paradigm, in which each sequence token is represented as a dense vector via an embedding matrix, and fed into a contextualization component that aggregates the information from the entire sequence in order to make a prediction. Could NLP models work without the embedding component? To that end, we omit the input and output embeddings from a standard machine translation model, and represent text as a sequence of bytes via UTF-8 encoding, using a constant 256-dimension one-hot representation for each byte. Experiments on 10 language pairs show that removing the embedding matrix consistently improves the performance of byte-to-byte models, often outperforms character-to-character models, and sometimes even produces better translations than standard subword models.
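The byte-level input described above can be sketched in a few lines. This is a minimal illustration of UTF-8 encoding followed by a constant 256-dimension one-hot vector per byte; the function name is hypothetical and the sketch does not reproduce the authors' model code.

```python
def bytes_one_hot(text):
    """Encode text as UTF-8 and return one 256-dimensional one-hot list per byte."""
    vectors = []
    for byte_value in text.encode("utf-8"):
        row = [0] * 256          # one slot per possible byte value
        row[byte_value] = 1      # mark the slot for this byte
        vectors.append(row)
    return vectors

# "é" takes two bytes in UTF-8, so this two-character string yields three vectors.
encoded = bytes_one_hot("né")
print(len(encoded), len(encoded[0]))
```

Because the representation is fixed, no embedding matrix has to be learned for the input; the sequence simply becomes longer for characters outside the ASCII range.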
Why Should You Patent Your AI Inventions
The ease with which you shop at Amazon and scroll through products customised to your taste is the result of AI technology that analyses and predicts your shopping behavior. Countless AI start-ups have been founded primarily to use AI to make the world a better place. The challenges that the world faces today, primarily climate change, are a huge motivator for AI innovation. Every AI idea is bound to help us face such challenges. In 2018, venture funding for AI grew to about 9.3 billion dollars in the U.S. alone.
6 Common Applications of Machine Learning That Are Hiding in Plain Sight
Machine Learning, a sub-branch of Artificial Intelligence, has established itself as the new go-to technology for businesses worldwide. Whether it is e-commerce or healthcare, almost all industries are using Machine Learning extensively to make futuristic solutions and products. Machine Learning depends heavily on programs and algorithms that help machines self-learn without having to be instructed explicitly. Machine Learning is pretty much dictating our daily lives. How, you wonder? Let's look at the top applications of Machine Learning to understand how it is shaping the digital economy.
Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study
Bahri, Dara, Tay, Yi, Zheng, Che, Metzler, Donald, Brunk, Cliff, Tomkins, Andrew
Large generative language models such as GPT-2 are well-known for their ability to generate text as well as their utility in supervised downstream tasks via fine-tuning. Our work is twofold: firstly we demonstrate via human evaluation that classifiers trained to discriminate between human and machine-generated text emerge as unsupervised predictors of "page quality", able to detect low quality content without any training. This enables fast bootstrapping of quality indicators in a low-resource setting. Secondly, curious to understand the prevalence and nature of low quality pages in the wild, we conduct extensive qualitative and quantitative analysis over 500 million web articles, making this the largest-scale study ever conducted on the topic.
On Learning Language-Invariant Representations for Universal Machine Translation
Zhao, Han, Hu, Junjie, Risteski, Andrej
The goal of universal machine translation is to learn to translate between any pair of languages, given a corpus of paired translated documents for a small subset of all pairs of languages. Despite impressive empirical results and an increasing interest in massively multilingual models, theoretical analysis on translation errors made by such universal machine translation models is only nascent. In this paper, we formally prove certain impossibilities of this endeavour in general, as well as prove positive results in the presence of additional (but natural) structure of data. For the former, we derive a lower bound on the translation error in the many-to-many translation setting, which shows that any algorithm aiming to learn shared sentence representations among multiple language pairs has to make a large translation error on at least one of the translation tasks, if no assumption on the structure of the languages is made. For the latter, we show that if the paired documents in the corpus follow a natural encoder-decoder generative process, we can expect a natural notion of "generalization": a linear number of language pairs, rather than quadratic, suffices to learn a good representation. Our theory also explains what kinds of connection graphs between pairs of languages are better suited: ones with longer paths result in worse sample complexity in terms of the total number of documents per language pair needed. We believe our theoretical insights and implications contribute to the future algorithmic design of universal machine translation.
Creative AI Through Evolutionary Computation: Principles and Examples
In the last decade or so we have seen tremendous progress in Artificial Intelligence (AI). AI is now in the real world, powering applications that have a large practical impact. Most of it is based on modeling, i.e. machine learning of statistical models that make it possible to predict what the right decision might be in future situations. For example, we now have object recognition, speech recognition, game playing, language understanding, and machine translation systems that rival human performance, and in many cases exceed it [28, 10, 9]. In each of these cases, massive amounts of supervised data exist, specifying the right answer to each input case.
Why 'human-like' is a low bar for most AI projects
The AI market is expected to eclipse $300 billion by 2025. And the vast majority of the companies trying to cash in on that bonanza are marketing some form of "human-like" AI. Maybe it's time to reconsider that approach. The big idea is that human-like AI is an upgrade. Computers compute, but AI can learn.
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
Mohan, Devang S Ram, Lenain, Raphael, Foglianti, Lorenzo, Teh, Tian Huey, Staib, Marlene, Torresquintero, Alexandra, Gao, Jiameng
Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner.
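To illustrate what an interleaved sequence of read and synthesise actions looks like, here is a toy deterministic rule of the kind a learned agent might be compared against. It is a sketch under strong assumptions: the function name, the action labels, the fixed lookahead rule, and the simplification of one synthesis step per character are all hypothetical and are not taken from the paper.

```python
def fixed_lookahead_schedule(num_chars, lookahead=2):
    """Return a list of "READ"/"SPEAK" actions for a toy incremental policy.

    Rule: stay `lookahead` characters ahead of the audio while input remains,
    then speak whatever is left. Assumes one SPEAK step per character.
    """
    if lookahead < 1:
        raise ValueError("lookahead must be at least 1")
    actions = []
    chars_read = 0
    chars_spoken = 0
    while chars_spoken < num_chars:
        # Read while input remains and we are fewer than `lookahead` characters ahead.
        if chars_read < num_chars and chars_read - chars_spoken < lookahead:
            actions.append("READ")
            chars_read += 1
        else:
            actions.append("SPEAK")
            chars_spoken += 1
    return actions

schedule = fixed_lookahead_schedule(num_chars=4, lookahead=2)
print(schedule)
```

A very large `lookahead` reduces to the non-incremental case in which the whole input is read before any audio is produced, while a small one lowers latency at the cost of less context per synthesis step.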
We Need to Talk About Linguistic Diversity in AI
Of the 7,117 living languages currently known, Apple's Siri supports 21, Amazon Alexa eight, and Google Home 13. Our learned ability to use words to construct sentences that convey information, ideas, and emotions in an organized way makes us unique among animals. However, language has significance beyond communication. It is an expression of cultural identity, a demonstration of the existence of communities of peoples. According to Ethnologue: Languages of the World, there are currently 7,117 known living languages.