Many areas of AI today use benchmarks and competitions with larger and wider sets of tasks. This is meant to deter AI systems (and research effort) from specialising in a single task, and to encourage them to be prepared to solve previously unseen tasks. It is unclear, however, whether the methods with the best performance are actually those that are most general and, more broadly, whether the trend is moving towards more general AI systems. This question has a striking similarity with the analysis of the so-called positive manifold and general factors in the area of human intelligence. In this paper, we first show how the existence of a manifold (positive average pairwise task correlation) can also be analysed in AI, and how this relates to the notion of agent generality, from the individual and the populational points of view. From the populational perspective, we analyse the following question: is this manifold correlation higher for the most or for the least able group of agents? We contrast this analysis with one of the most controversial issues in human intelligence research, the so-called Spearman's Law of Diminishing Returns (SLODR), which basically states that the relevance of a general factor diminishes for the most able human groups. We perform two empirical studies on these issues in AI. We analyse the results of the 2015 general video game AI (GVGAI) competition, with games as tasks and "controllers" as agents, and the results of a synthetic setting, with modified elementary cellular automata (ECA) rules as tasks and simple interactive programs as agents. In both cases, we see that SLODR does not appear. The data, and the use of just two scenarios, do not clearly support the reverse either, a Universal Law of Augmenting Returns (ULOAR), but call for more experiments on this question.
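The manifold described in this abstract (the average pairwise correlation between tasks) can be illustrated with a small sketch. This is not the authors' procedure: the function names are hypothetical, the input is assumed to be a plain agents-by-tasks score matrix, and the median split into a less able and a more able group is only one simple way to form the two groups.

```python
import numpy as np


def mean_pairwise_task_correlation(scores):
    """Average Pearson correlation over all distinct pairs of tasks.

    scores: 2D array-like, rows = agents, columns = tasks.
    """
    scores = np.asarray(scores, dtype=float)
    # rowvar=False treats each column (task) as a variable.
    corr = np.corrcoef(scores, rowvar=False)
    # Keep each task pair once: the strict upper triangle.
    upper = np.triu_indices(corr.shape[0], k=1)
    # nanmean skips pairs that involve a task with constant scores.
    return float(np.nanmean(corr[upper]))


def manifold_by_ability(scores):
    """Return (low_group, high_group) manifold values.

    Assumption: ability is the mean score of an agent over all tasks,
    and the groups are the lower and upper halves of that ranking.
    """
    scores = np.asarray(scores, dtype=float)
    ability = scores.mean(axis=1)
    order = np.argsort(ability)
    half = len(order) // 2
    low, high = scores[order[:half]], scores[order[half:]]
    return (mean_pairwise_task_correlation(low),
            mean_pairwise_task_correlation(high))
```

Under SLODR, the value for the more able group would be expected to be lower than the value for the less able group. Note that splitting a population by ability restricts the range of scores within each group, which by itself affects correlations, so a comparison like this needs care when interpreted.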
AI-complete problems are those whose solution is likely to require all or most of human-level general artificial intelligence. A few problems in this category are listed below. Open-domain dialog is the problem of competently conducting a dialog with a human when the subject of the discussion is not known in advance. The challenge includes language understanding, dialog pragmatics, and understanding the world. Versions of the task include spoken and written dialog.
The Minecraft video game is popular with children; now a version is being used to test artificial-intelligence programs. The Minecraft video game was familiar to José Hernández-Orallo long before he started using it for his own research. The computer scientist, who devises ways to benchmark machine intelligence at the Polytechnic University of Valencia in Spain, first watched his own children play inside the 3D virtual world, which focuses on solving problems rather than shooting monsters. In 2014, Microsoft bought Minecraft, and its science arm, Microsoft Research, gave its own researchers access to a new version of the game that allowed computer programs, as well as people, to explore and customize the 3D environment. Then, after inviting a small group of outside researchers that included Hernández-Orallo to download the machine-friendly version of the world, last July, Microsoft made it freely available to anyone, with the goal of speeding up progress in artificial intelligence (AI).
Item response theory (IRT) can be applied to the analysis of results from AI benchmarks. The two-parameter IRT model provides two indicators (difficulty and discrimination) on the side of the item (or AI problem) but only one indicator (ability) on the side of the respondent (or AI agent). In this paper we analyse how to make this set of indicators dual, by adding a fourth indicator, generality, on the side of the respondent. Generality is meant to be dual to discrimination, and it is based on difficulty. Namely, generality is defined as a new metric that evaluates whether an agent is consistently good at easy problems and bad at difficult ones. With the addition of generality, we see that this set of four key indicators can give us more insight into the results of AI benchmarks. In particular, we explore two popular benchmarks in AI, the Arcade Learning Environment (Atari 2600 games) and the General Video Game AI competition. We provide some guidelines to estimate and interpret these indicators for other AI benchmarks and competitions.
I. INTRODUCTION
The evaluation of AI systems has traditionally been done with one system evaluated on one single problem.
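For reference, the two-parameter IRT model mentioned in the abstract above is conventionally written as a logistic curve, in which the probability of a correct response depends on the respondent's ability and on the item's difficulty and discrimination. The sketch below shows only that standard curve; the function and parameter names are illustrative, and it does not implement the generality indicator, whose exact definition is not given in this excerpt.

```python
import math


def two_pl_probability(ability, difficulty, discrimination):
    """Standard two-parameter logistic (2PL) item response function.

    ability:        respondent (AI agent) parameter
    difficulty:     item (AI problem) location parameter
    discrimination: item slope parameter (how sharply the curve rises)
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
```

When ability equals difficulty the probability is 0.5, and a larger discrimination makes the curve steeper around that point, so the item separates agents just below and just above its difficulty more sharply.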
To get your morning started right, take a look at the latest stories published since yesterday.
Science fiction has moved one step closer to science fact, with Microsoft's launch of a universal translator, an application that allows conversations between groups of people speaking different languages to be translated in real time.
Satya Nadella, chief executive officer of Microsoft Corp., speaks during a keynote session at the Microsoft Developers Build Conference in San Francisco, California, U.S., on Wednesday, March 30, 2016. Microsoft Corp. said it will release a translation app for Windows and other operating systems from rivals Apple Inc. and Google, part of a broader race between the world's largest technology companies for supremacy in the field of artificial intelligence.