A positive label means that an utterance was an actual response to a context, and a negative label means that it wasn't: the utterance was picked at random from somewhere else in the corpus. Each record in the test/validation set consists of a context, a ground-truth utterance (the real response), and 9 incorrect utterances called distractors. Before starting with fancy Neural Network models, let's build a few simple baselines to get a sense of what kind of performance we can expect. The Deep Learning model we will build in this post is called a Dual Encoder LSTM network.
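To make the evaluation setup concrete, here is a minimal sketch of the simplest possible baseline: rank the 10 candidate responses (1 ground truth + 9 distractors) in a random order and measure recall@k, i.e. how often the true response lands in the top k. The function names, the convention that candidate 0 is the ground truth, and the record counts are illustrative assumptions, not part of the original dataset code.

```python
import random

def recall_at_k(ranked_candidate_ids, true_id, k):
    # 1 if the true response appears among the top-k ranked candidates, else 0.
    return int(true_id in ranked_candidate_ids[:k])

def random_baseline(num_candidates=10, seed=0):
    # Hypothetical baseline: rank the candidates in a uniformly random order.
    rng = random.Random(seed)
    order = list(range(num_candidates))
    rng.shuffle(order)
    return order

def evaluate(num_records=10000, k=1):
    # Simulate many test records; by our convention, candidate 0 is the
    # ground-truth response in every record.
    correct = 0
    for i in range(num_records):
        ranking = random_baseline(seed=i)
        correct += recall_at_k(ranking, true_id=0, k=k)
    return correct / num_records

print(round(evaluate(k=1), 2))  # should be close to 0.10 (1-in-10 chance)
print(round(evaluate(k=5), 2))  # should be close to 0.50
```

A random predictor sets the floor: roughly 10% recall@1 and 50% recall@5 with 10 candidates, so any learned model has to beat those numbers to be worth anything.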
Last week, machine learning took a big leap forward when Google's AlphaGo, a machine-learning algorithm, beat the world champion Lee Sedol at the game of Go. The milestone recalls IBM Watson's victory over former champions Ken Jennings and Brad Rutter on the game show Jeopardy!. Even though it doesn't rely on encoded rules, Watson requires close monitoring by domain experts who provide data and evaluate its performance. AlphaGo, by contrast, was programmed to seek positive rewards in the form of game scores and to improve continually by playing millions of games against tweaked versions of itself.