"The challenges of machine learning have long been tied to games as a testbed for computer intelligence." Jeopardy Champion Emma Boettcher's Master's paper on using text mining to predict how hard a Jeopardy clue might be didn't win her a title on its own, but it is an interesting thought experiment. Futurism's mission is to empower our readers and drive the development of transformative technologies towards maximizing human potential.
David Ferrucci will deliver a keynote at the O'Reilly Artificial Intelligence Conference in NYC, June 26-29, 2017. His colleague Jennifer Chu-Carroll will also give a talk, "Beyond the state of the art in reading comprehension," at the same conference. In this episode of the Data Show, I spoke with David Ferrucci, founder of Elemental Cognition and senior technologist at Bridgewater Associates.
The Artificial Intelligence revolution is here. We are moving further into an age where the imagination stirred by a childhood spent watching movies is becoming reality. Leading us into this age are the typical (and untypical) tech giants, who are fiercely competing for the next breakthrough. Project Oxford is Microsoft's venture into the world of artificial intelligence and deep learning. It spans several key areas, including image, facial, text, and speech recognition, and Microsoft hopes to build the technology into its computer operating systems and smartphone software.
Major AI researchers and textbooks define the field as "the study and design of intelligent agents", where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success. John McCarthy, who coined the term in 1955, defines it as "The science and engineering of making intelligent machines". AI research is highly technical and specialized, deeply divided into subfields that often fail to communicate with each other. Some of the division is due to social and cultural factors: subfields have grown up around particular institutions and the work of individual researchers. AI research is also divided by several technical issues.
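The "intelligent agent" definition above is standard textbook material: a system that maps percepts of its environment to actions chosen in pursuit of a goal. As a hedged illustration only (the `ThermostatAgent` class and its thresholds are my own minimal example, not from this excerpt), even a simple reflex agent fits the definition:

```python
class ThermostatAgent:
    """A minimal reflex agent: perceives a temperature reading and
    selects the action that moves the environment toward its goal."""

    def __init__(self, target=21.0):
        self.target = target  # goal state: desired room temperature (C)

    def act(self, perceived_temp):
        # Map the current percept to the action that maximizes
        # progress toward the target, with a small dead band to
        # avoid oscillating between heating and cooling.
        if perceived_temp < self.target - 0.5:
            return "heat"
        if perceived_temp > self.target + 0.5:
            return "cool"
        return "idle"
```

More capable agents replace the hand-written rules with learned policies, but the perceive-then-act loop is the same.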
The Jeopardy! Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After three years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence, and speed at the Jeopardy! quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that may be used as a foundation for combining, deploying, evaluating, and advancing a wide range of algorithmic techniques to rapidly advance the field of QA. The architecture and methodology developed as part of this project have highlighted the need to take a systems-level approach to research in QA, and we believe this applies to research in the broader field of AI. We have developed many different algorithms for addressing different kinds of problems in QA and plan to publish many of them in more detail in the future.
Major advances in Question Answering technology were needed for IBM Watson to play Jeopardy! at championship level -- the show requires rapid-fire answers to challenging natural language questions, broad general knowledge, high precision, and accurate confidence estimates. In addition, Jeopardy! features four types of decision making that carry great strategic importance: (1) Daily Double wagering; (2) Final Jeopardy wagering; (3) selecting the next square when in control of the board; and (4) deciding whether to attempt to answer, i.e., "buzz in." Sophisticated strategies for these decisions that properly account for the game state and future event probabilities can significantly boost a player's overall chances to win compared with simple "rule of thumb" strategies. This article presents our approach to developing Watson's game-playing strategies: building a faithful simulation model, then using learning and Monte Carlo methods within the simulator to optimize Watson's strategic decision making. After giving a detailed description of each of our game-strategy algorithms, we focus in particular on validating the accuracy of the simulator's predictions and documenting the performance improvements obtained with our methods. Quantitative performance benefits are shown with respect to both simple heuristic strategies and actual human contestant performance in historical episodes. We further extend our analysis of human play to derive a number of valuable and counterintuitive examples illustrating how human contestants may improve their performance on the show.
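Watson's actual strategy modules are far more elaborate, but the core idea of using Monte Carlo simulation to optimize a wagering decision can be sketched in a few lines. In this illustrative sketch (the function names `win_probability` and `best_wager`, the $100 wager grid, the independence of the two players' outcomes, and the strict-win tiebreak are my simplifying assumptions, not details from the article), we estimate the win probability of each candidate Final Jeopardy wager by simulating many game outcomes and pick the wager with the best estimate:

```python
import random

def win_probability(my_score, opp_score, my_wager, opp_wager,
                    p_me_right, p_opp_right, trials=10000):
    """Estimate P(win) for a given Final Jeopardy wager by simulating
    many independent outcomes of both players' responses."""
    wins = 0
    for _ in range(trials):
        # Each player answers correctly with their respective probability.
        my_final = my_score + (my_wager if random.random() < p_me_right
                               else -my_wager)
        opp_final = opp_score + (opp_wager if random.random() < p_opp_right
                                 else -opp_wager)
        if my_final > opp_final:  # simplification: ties count as losses
            wins += 1
    return wins / trials

def best_wager(my_score, opp_score, opp_wager, p_me_right, p_opp_right):
    """Search a grid of legal wagers for the one with the highest
    estimated win probability."""
    return max(range(0, my_score + 1, 100),
               key=lambda w: win_probability(my_score, opp_score, w,
                                             opp_wager,
                                             p_me_right, p_opp_right))
```

A faithful simulator, as the article describes, would also model the opponent's wager as uncertain and evaluate entire games rather than a single final decision; this sketch only shows the shape of the Monte Carlo optimization step.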
Ferrucci, David; Brown, Eric; Chu-Carroll, Jennifer; Fan, James; Gondek, David; Kalyanpur, Aditya A.; Lally, Adam; Murdock, J. William (IBM T. J. Watson Research Center); Nyberg, Eric (Carnegie Mellon University); Prager, John (IBM T. J. Watson Research Center); Schlaefer, Nico (Carnegie Mellon University); Welty, Chris (IBM T. J. Watson Research Center)
IBM Research undertook a challenge to build a computer system that could compete at the human champion level in real time on the American TV quiz show Jeopardy! The extent of the challenge includes fielding a real-time automatic contestant on the show, not merely a laboratory exercise. The Jeopardy! Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After three years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence, and speed at the Jeopardy! quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that may be used as a foundation for combining, deploying, evaluating, and advancing a wide range of algorithmic techniques to rapidly advance the field of QA.