A positive label means that an utterance was an actual response to a context, and a negative label means that it wasn't: the utterance was picked at random from elsewhere in the corpus. Each record in the test/validation set consists of a context, a ground-truth utterance (the real response), and 9 incorrect utterances called distractors. Before diving into fancy neural network models, let's build some simple baseline models to get a sense of what kind of performance we can expect. The deep learning model we will build in this post is called a Dual Encoder LSTM network.
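To make "performance" concrete before any model exists, here is a hedged sketch of the simplest possible baseline: score each of the 10 candidate utterances at random and measure recall@k, i.e. how often the ground-truth response lands among the top k scored candidates. The function names and the convention that the ground truth sits at index 0 of each candidate list are assumptions for this example, not part of any particular library.

```python
import random

def evaluate_recall(scored_examples, k=1):
    """Fraction of examples where the ground-truth utterance
    (assumed to be at index 0) ranks in the top k by score."""
    hits = 0
    for scores in scored_examples:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if 0 in ranked[:k]:
            hits += 1
    return hits / len(scored_examples)

def random_baseline(context, utterances):
    """Ignore the context entirely; assign a random score to each candidate."""
    return [random.random() for _ in utterances]

# Each test record has 10 candidates: 1 ground truth + 9 distractors.
random.seed(42)
examples = [random_baseline("some context", range(10)) for _ in range(10000)]
print(evaluate_recall(examples, k=1))   # should hover near 0.10 by chance
print(evaluate_recall(examples, k=10))  # trivially 1.0: all candidates fit in top 10
```

Any model worth training has to beat recall@1 of roughly 0.10; that chance level is the floor the fancier models are measured against.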
Scientists have used deep learning algorithms with multiple processing layers (hence "deep") to build better models from large quantities of unlabeled data, such as photos with no description, voice recordings, or videos on YouTube. Google's voice recognition algorithms operate on a massive training set, yet it is not nearly big enough to predict every possible word, phrase, or question you could put to them. Google's deep learning system famously learned to discover cats in unlabeled YouTube frames, and deep networks with some 120 million parameters now deliver superior face recognition.
Google, Baidu, and Microsoft have the resources to build dedicated deep learning clusters that give their algorithms a level of processing power that both accelerates training and improves model accuracy. Yahoo, however, has taken a slightly different approach: instead of a dedicated deep learning cluster, it combines Caffe with Spark. Its ML Big Data team's CaffeOnSpark software lets the entire process of building and deploying a deep learning model run on a single cluster. The MapR Converged Data Platform is an ideal fit for this project: it provides the power of distributed Caffe on a cluster with enterprise-grade robustness and lets you take advantage of the MapR high-performance file system.
Last week, machine learning took a big leap forward when Google's AlphaGo, a machine algorithm, beat the world champion, Lee Sedol, at the game of Go, echoing IBM Watson's victory over former champions Ken Jennings and Brad Rutter on the game show Jeopardy! Even though Watson doesn't rely on encoded rules, it requires close monitoring by domain experts who provide data and evaluate its performance. AlphaGo, by contrast, was programmed to seek positive rewards in the form of scores, continually improving its system by playing millions of games against tweaked versions of itself.
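That last sentence, playing against tweaked copies of yourself and keeping whatever wins, can be sketched in miniature. This is an illustrative hill-climbing toy, not AlphaGo's actual reinforcement learning setup: the "game", the policy representation, and every name below are assumptions invented for the example.

```python
import random

def play_game(policy_a, policy_b):
    """Toy stand-in for Go: the policy with the larger total weight wins."""
    return 1 if sum(policy_a) >= sum(policy_b) else -1

def tweak(policy, scale=0.1):
    """Return a slightly perturbed copy of the policy."""
    return [w + random.uniform(-scale, scale) for w in policy]

random.seed(0)
policy = [0.0] * 4           # start from a blank policy
for _ in range(1000):        # millions of games in the real system
    rival = tweak(policy)    # a tweaked version of itself
    if play_game(rival, policy) > 0:
        policy = rival       # positive reward: adopt the winning weights
print(sum(policy))           # the surviving policy keeps improving
```

The loop only ever replaces the current policy with a version that just beat it, so the "score" it seeks ratchets upward over time; the real AlphaGo adds deep networks and tree search on top of this basic self-improvement loop.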