Google AI: Introducing the Schema-Guided Dialogue Dataset for Conversational Assistants


Conversational assistants are one of the most interesting AI advances of recent years. They have become an increasingly meaningful part of our personal lives and of businesses, which use them to improve customer service. The future of these assistants looks exciting, and the field is set to keep expanding: the smart virtual assistant market is estimated to grow at a CAGR of more than 26% and to exceed $12 billion by 2024.

WikiTableQuestions: a Complex Real-World Question Understanding Dataset - The Stanford Natural Language Processing Group


Natural language question understanding has been one of the most important challenges in artificial intelligence. Indeed, eminent AI benchmarks such as the Turing test require an AI system to understand natural language questions of varying topic and complexity, and then respond appropriately. During the past few years, we have witnessed rapid progress in question answering technology, with virtual assistants like Siri, Google Now, and Cortana answering daily life questions, and IBM Watson beating human contestants on Jeopardy!. Many questions these systems encounter are simple lookup questions (e.g., "Where is Chichen Itza?" or "Who's the manager of Man Utd?"). The answers can be found by searching the surface forms.

The Best Public Datasets for Machine Learning


First, a couple of pointers to keep in mind when searching for datasets. Kaggle: A data science site that hosts a variety of interesting, externally contributed datasets. You can find all kinds of niche datasets in its master list, from ramen ratings to basketball data and even Seattle pet licenses. Although the datasets are user-contributed, and thus have varying levels of cleanliness, the vast majority are clean. VisualData: A site for discovering computer vision datasets by category; it supports searchable queries.