Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties. How can a human train an intelligent agent to understand natural language instructions? We believe that this research question is important from both technological and scientific perspectives. No matter how advanced AI technology becomes, human users may want to customize their intelligent helpers to be able to better understand their desires and needs.
Using artificial intelligence to translate sign language in real time - see how we used Python to train a neural network with 86% accuracy in less than a day. Imagine a world where anyone can communicate using sign language over video. Inspired by this vision, some of our engineering team decided to bring this idea to HealthHack 2018. In less than 48 hours and using the power of artificial intelligence, their team was able to produce a working prototype which translated signs from the Auslan alphabet to English text in real time. People who are hearing impaired are left behind in video consultations.
Cloze tests are widely adopted in language exams to evaluate students' language proficiency. In this paper, we propose the first large-scale human-created cloze test dataset, CLOTH, containing questions used in middle-school and high-school language exams. With blanks carefully created by teachers and candidate choices purposely designed to be nuanced, CLOTH requires deeper language understanding and a wider attention span than previous automatically generated cloze datasets. We test the performance of purpose-built baseline models, including a language model trained on the One Billion Word Corpus, and show that humans outperform them by a significant margin. We investigate the source of the performance gap, trace model deficiencies to some distinct properties of CLOTH, and identify the limited ability to comprehend long-term context as the key bottleneck.
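To make the task format concrete, here is a minimal sketch of how a multiple-choice cloze item might be represented and scored. The `ClozeItem` class, the example sentence, and the toy bigram scorer are all illustrative assumptions; they are not taken from CLOTH or from the paper's baselines, where a trained language model would take the scorer's place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClozeItem:
    passage: str        # text with a single "_" marking the blank (assumed convention)
    options: List[str]  # candidate fillers
    answer: int         # index of the correct option


def fill(passage: str, option: str) -> str:
    """Insert one candidate into the first blank of the passage."""
    return passage.replace("_", option, 1)


def predict(item: ClozeItem, score_fn: Callable[[str], float]) -> int:
    """Pick the option whose filled-in passage scores highest."""
    scores = [score_fn(fill(item.passage, opt)) for opt in item.options]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(items: List[ClozeItem], score_fn: Callable[[str], float]) -> float:
    """Fraction of items answered correctly."""
    correct = sum(predict(it, score_fn) == it.answer for it in items)
    return correct / len(items)


# Hypothetical stand-in for a language model: counts known word bigrams.
KNOWN_BIGRAMS = {("she", "goes")}


def toy_score(sentence: str) -> float:
    tokens = sentence.lower().split()
    return float(sum((a, b) in KNOWN_BIGRAMS for a, b in zip(tokens, tokens[1:])))


items = [ClozeItem("She _ to school every day.", ["go", "goes", "going", "gone"], 1)]
print(accuracy(items, toy_score))
```

Scoring each filled-in passage as a whole, rather than only the word in the blank, is what would let a stronger scorer use context from elsewhere in the passage.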
It seems like voice interfaces are going to be a big part of the future of computing, popping up in phones, smart speakers, and even household appliances. But how useful is this technology for people who don't communicate using speech? Are we creating a system that locks out certain users? These were the questions that inspired software developer Abhishek Singh to create a mod that lets Amazon's Alexa assistant understand some simple sign language commands. In a video, Singh demonstrates how the system works.
When we discuss artificial intelligence (AI), how are machines learning? What kinds of projects feed into greater understanding? For our friends over at IBM, one surprising answer is movies. To build smarter AI systems, IBM researchers are using movie plots and neural networks to explore new ways of enhancing the language understanding capabilities of AI models. IBM will present key findings from two papers on these topics at the Association for Computational Linguistics (ACL) annual meeting this week in Melbourne, Australia.
Memrise, a UK startup whose eponymous language-learning app employs machine learning and localised content to adapt to users' needs as they progress through their lessons, has raised another $15.5 million in funding to expand its product. The funding comes after a period of strong growth: Memrise has now passed 35 million users globally across its 20 language courses, and it tipped into profitability in Q1 of this year. Ed Cooke, who co-founded the app with Ben Whately and Greg Detre, told TechCrunch that this places it as the second-most popular language app globally in terms of both users and revenues. This round, a Series B, was led by Octopus Ventures and Korelya Capital, along with participation from existing investors Avalon Ventures and Balderton Capital. Memrise is not disclosing its valuation -- it has raised a relatively modest $22 million to date -- but Cooke (who is also the CEO) said the plan will be to use the funding to expand its AI platform and add in more features for users.
For many AI services, it is critical to be able to comprehend human language and even converse in it with human users. So far, advances in natural language processing (NLP) powered by "sub-symbolic" machine learning based on deep neural networks allow us to solve multiple tasks such as machine translation, classification, and emotion recognition. However, using these approaches requires an enormous amount of training. Additionally, there are increasing legal restrictions in particular applications due to recent regulations, making current solutions unviable. The ultimate goal for these industry initiatives is to allow humans and AI to interact fluently in a common language.
We propose a two-stage neural model to tackle question generation from documents. First, our model estimates the probability that word sequences in a document are ones that a human would pick when selecting candidate answers by training a neural key-phrase extractor on the answers in a question-answering corpus. Predicted key phrases then act as target answers and condition a sequence-to-sequence question-generation model with a copy mechanism. Empirically, our key-phrase extraction model significantly outperforms an entity-tagging baseline and existing rule-based approaches. We further demonstrate that our question generation system formulates fluent, answerable questions from key phrases. This two-stage system could be used to augment or generate reading comprehension datasets, which may be leveraged to improve machine reading systems or in educational settings.
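The data flow of such a two-stage system can be sketched as follows. This is only an illustration of how stage-one key phrases condition stage two: both stages are replaced by simple stand-ins (a regular expression instead of the neural key-phrase extractor, and a fill-in-the-blank template instead of the sequence-to-sequence generator with a copy mechanism). All function names and the example document are hypothetical.

```python
import re
from typing import List, Tuple

# Stand-in for stage 1 (assumption): treat runs of two or more capitalised
# words and four-digit numbers as candidate answers. The paper instead
# trains a neural key-phrase extractor on question-answering data.
KEY_PHRASE_PATTERN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+|\d{4}")


def extract_key_phrases(document: str) -> List[str]:
    return KEY_PHRASE_PATTERN.findall(document)


# Stand-in for stage 2 (assumption): blank out the answer to form a prompt.
# The paper instead conditions a sequence-to-sequence model on the answer.
def generate_question(document: str, answer: str) -> str:
    return "Fill in the blank: " + document.replace(answer, "___", 1)


def question_answer_pairs(document: str) -> List[Tuple[str, str]]:
    """Run stage 1, then condition stage 2 on each predicted key phrase."""
    return [(generate_question(document, kp), kp)
            for kp in extract_key_phrases(document)]


doc = "Marie Curie won the Nobel Prize in 1903."
for question, answer in question_answer_pairs(doc):
    print(question, "->", answer)
```

Keeping the two stages behind separate functions mirrors the modular design described above: either stand-in could be swapped for a learned model without changing the pipeline.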
Building intelligent agents that can communicate with and learn from humans in natural language is of great value. Supervised language learning is limited in that it mainly captures the statistics of the training data; it adapts poorly to new scenarios and cannot flexibly acquire new knowledge without inefficient retraining or catastrophic forgetting. We highlight the perspective that conversational interaction serves as a natural interface both for language learning and for novel knowledge acquisition, and we propose a joint imitation and reinforcement approach for grounded language learning through an interactive conversational game. The agent trained with this approach is able to actively acquire information by asking questions about novel objects and to use the just-learned knowledge in subsequent conversations in a one-shot fashion. Comparisons with other methods verify the effectiveness of the proposed approach.
Towards the vision of translating code that implements an algorithm from one programming language into another, this paper proposes an approach for automated program classification using bilateral tree-based convolutional neural networks (BiTBCNNs). A BiTBCNN is layered on top of two tree-based convolutional neural networks (TBCNNs), each of which recognizes the algorithm of code written in an individual programming language. The combination layer of the networks recognizes the similarities and differences among code in different programming languages. The BiTBCNNs are trained using source code written in different languages but known to implement the same algorithms and/or functionalities. For a preliminary evaluation, we use 3591 Java and 3534 C++ code snippets covering 6 algorithms, which we crawled systematically from GitHub. We obtained over 90% accuracy in the cross-language binary classification task of telling whether any two given code snippets implement the same algorithm. Also, for the algorithm classification task, i.e., predicting which of the six algorithm labels applies to an arbitrary C++ code snippet, we achieved over 80% precision.
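The cross-language binary classification task can be framed as labelling (Java snippet, C++ snippet) pairs by whether they share an algorithm. The sketch below shows one way such pairs might be built from labelled snippets. The pairing scheme, function name, and placeholder snippets are assumptions for illustration; the abstract does not say how the authors sampled their training pairs.

```python
from itertools import product
from typing import List, Tuple

Snippet = Tuple[str, str]  # (source code, algorithm label)


def make_pairs(java: List[Snippet], cpp: List[Snippet]) -> List[Tuple[str, str, int]]:
    """Pair every Java snippet with every C++ snippet.

    The target is 1 when both snippets carry the same algorithm label,
    and 0 otherwise.
    """
    return [
        (j_code, c_code, int(j_label == c_label))
        for (j_code, j_label), (c_code, c_label) in product(java, cpp)
    ]


# Placeholder snippets (hypothetical); real inputs would be parsed into ASTs
# before being fed to the tree-based networks.
java_snippets = [("/* java bubble sort */", "bubblesort"),
                 ("/* java bfs */", "bfs")]
cpp_snippets = [("// c++ bubble sort", "bubblesort"),
                ("// c++ bfs", "bfs")]

pairs = make_pairs(java_snippets, cpp_snippets)
positives = sum(label for _, _, label in pairs)
print(len(pairs), positives)
```

With six algorithms, exhaustive pairing produces far more negative than positive pairs, so a practical setup would likely subsample or rebalance them.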