Overview
Transfer Learning using Kolmogorov Complexity: Basic Theory and Empirical Evaluations
In transfer learning we aim to solve new problems with fewer examples by using information gained from solving related problems. Transfer learning has been successful in practice, and extensive PAC analysis of these methods has been developed. However, it is not yet clear how to define relatedness between tasks. This is considered a major problem: it is conceptually troubling, and it makes it unclear how much information to transfer and when and how to transfer it. In this paper we propose to measure the amount of information one task contains about another using conditional Kolmogorov complexity between the tasks. We show how existing theory neatly solves the problem of measuring relatedness and transferring the 'right' amount of information in sequential transfer learning in a Bayesian setting. The theory also suggests that, in a very formal and precise sense, no other reasonable transfer method can do much better than our Kolmogorov complexity theoretic transfer method, and that sequential transfer is always justified. We also develop a practical approximation to the method and use it to transfer information between 8 arbitrarily chosen databases from the UCI ML repository.
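Kolmogorov complexity itself is uncomputable, so any practical use needs an approximation. The abstract does not say which approximation the authors use; the sketch below is only a generic illustration that substitutes an off-the-shelf compressor (zlib) for the ideal description length and estimates the conditional complexity K(x | y) as C(y + x) - C(y). The helper names and the toy "tasks" (pseudo-random byte strings) are assumptions made for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data; a computable stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def conditional_complexity(x: bytes, y: bytes) -> int:
    """Approximate K(x | y) as C(y + x) - C(y): extra bytes needed for x once y is known."""
    return compressed_size(y + x) - compressed_size(y)


def toy_task(seed: int, n: int = 500) -> bytes:
    """Hypothetical stand-in for an encoded task: n pseudo-random bytes."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))


new_task = toy_task(0)
# A "related" task shares almost all of its content with the new task.
related_task = new_task[:250] + b"edit" + new_task[250:]
# An "unrelated" task is generated independently.
unrelated_task = toy_task(1)

cost_given_related = conditional_complexity(new_task, related_task)
cost_given_unrelated = conditional_complexity(new_task, unrelated_task)
print(cost_given_related, cost_given_unrelated)
```

Under this proxy, a task that shares content with the new task makes the new task much cheaper to describe, which is the intuition behind measuring relatedness by conditional complexity.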
A Randomized Algorithm for Large Scale Support Vector Learning
Kumar, Krishnan, Bhattacharya, Chiru, Hariharan, Ramesh
This paper investigates the application of randomized algorithms to large scale SVM learning. The key contribution of the paper is to show that, by using ideas from random projections, the minimal number of support vectors required to solve almost separable classification problems, such that the solution obtained is near optimal with very high probability, is O(log n). A dataset is called almost separable if it becomes linearly separable on removal of properly chosen O(log n) points. The second contribution is a sampling-based algorithm, motivated by randomized algorithms, which solves an SVM problem by considering subsets of the dataset that are larger than the number of support vectors for the problem. These two ideas are combined to obtain an algorithm for SVM classification problems which performs the learning by considering only O(log n) points at a time. Experiments on synthetic and real-life datasets show that the algorithm scales up state-of-the-art SVM solvers in terms of memory required and execution time without loss in accuracy. The algorithm presented here complements existing large scale SVM learning approaches, as it can be used to scale up any SVM solver.
A learning framework for nearest neighbor search
Cayton, Lawrence, Dasgupta, Sanjoy
Can we leverage learning techniques to build a fast nearest-neighbor (NN) retrieval data structure? We present a general learning framework for the NN problem in which sample queries are used to learn the parameters of a data structure that minimize the retrieval time and/or the miss rate. We explore the potential of this novel framework through two popular NN data structures: KD-trees and the rectilinear structures employed by locality sensitive hashing. We derive a generalization theory for these data structure classes and present simple learning algorithms for both. Experimental results reveal that learning often improves on the already strong performance of these data structures.
AND/OR Multi-Valued Decision Diagrams (AOMDDs) for Graphical Models
Mateescu, R., Dechter, R., Marinescu, R.
Inspired by the recently introduced framework of AND/OR search spaces for graphical models, we propose to augment Multi-Valued Decision Diagrams (MDD) with AND nodes, in order to capture function decomposition structure and to extend these compiled data structures to general weighted graphical models (e.g., probabilistic models). We present the AND/OR Multi-Valued Decision Diagram (AOMDD) which compiles a graphical model into a canonical form that supports polynomial (e.g., solution counting, belief updating) or constant time (e.g., equivalence of graphical models) queries. We provide two algorithms for compiling the AOMDD of a graphical model. The first is search-based, and works by applying reduction rules to the trace of the memory intensive AND/OR search algorithm. The second is inference-based and uses a Bucket Elimination schedule to combine the AOMDDs of the input functions via the APPLY operator. For both algorithms, the compilation time and the size of the AOMDD are, in the worst case, exponential in the treewidth of the graphical model, rather than pathwidth as is known for ordered binary decision diagrams (OBDDs). We introduce the concept of semantic treewidth, which helps explain why the size of a decision diagram is often much smaller than the worst case bound. We provide an experimental evaluation that demonstrates the potential of AOMDDs.
Statistical Learning Theory: Models, Concepts, and Results
von Luxburg, Ulrike, Schoelkopf, Bernhard
Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms and is arguably one of the most beautifully developed branches of artificial intelligence in general. It originated in Russia in the 1960s and gained wide popularity in the 1990s following the development of the so-called Support Vector Machine (SVM), which has become a standard tool for pattern recognition in a variety of domains ranging from computer vision to computational biology. Providing the basis of new learning algorithms, however, was not the only motivation for developing statistical learning theory. It was just as much a philosophical one, attempting to answer the question of what it is that allows us to draw valid conclusions from empirical data. In this article we attempt to give a gentle, nontechnical overview of the key ideas and insights of statistical learning theory. We do not assume that the reader has a deep background in mathematics, statistics, or computer science. Given the nature of the subject matter, however, some familiarity with mathematical concepts and notations and some intuitive understanding of basic probability is required. There exist many excellent references to more technical surveys of the mathematics of statistical learning theory: the monographs by one of the founders of statistical learning theory (Vapnik, 1995, Vapnik, 1998), a brief overview of statistical learning theory in Section 5 of Schölkopf and Smola (2002), more technical overview papers such as Bousquet et al. (2003), Mendelson (2003), Boucheron et al. (2005), Herbrich and Williamson (2002), and the monograph Devroye et al. (1996).
AAAI 2008 Spring Symposia Reports
Balduccini, Marcello (Eastman Kodak Company) | Baral, Chitta (Arizona State University) | Brodaric, Boyan (Geological Survey of Canada) | Colton, Simon (Imperial College, London) | Fox, Peter (National Center for Atmospheric Research) | Gutelius, David (SRI International) | Hinkelmann, Knut (University of Applied Sciences Northwestern Switzerland) | Horswill, Ian (Northwestern University) | Huberman, Bernardo (HP Labs) | Hudlicka, Eva (Psychometrix Associates) | Lerman, Kristina (USC Information Sciences Institute) | Lisetti, Christine (Florida International University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Maher, Mary Lou (National Science Foundation) | Musen, Mark A. (Stanford University) | Sahami, Mehran (Stanford University) | Sleeman, Derek (University of Aberdeen) | Thönssen, Barbara (University of Applied Sciences Northwestern Switzerland) | Velasquez, Juan D. (MIT CSAIL) | Ventura, Dan (Brigham Young University)
The Association for the Advancement of Artificial Intelligence (AAAI) was pleased to present the AAAI 2008 Spring Symposium Series, held Wednesday through Friday, March 26–28, 2008 at Stanford University, California. The titles of the eight symposia were as follows: (1) AI Meets Business Rules and Process Management, (2) Architectures for Intelligent Theory-Based Agents, (3) Creative Intelligent Systems, (4) Emotion, Personality, and Social Behavior, (5) Semantic Scientific Knowledge Integration, (6) Social Information Processing, (7) Symbiotic Relationships between Semantic Web and Knowledge Engineering, (8) Using AI to Motivate Greater Participation in Computer Science. The goal of the AI Meets Business Rules and Process Management symposium was to investigate the various approaches and standards to represent business rules, business process management and the semantic web with respect to expressiveness and reasoning capabilities. The focus of the Architectures for Intelligent Theory-Based Agents symposium was the definition of architectures for intelligent theory-based agents, comprising languages, knowledge representation methodologies, reasoning algorithms, and control loops. The Creative Intelligent Systems symposium included five major discussion sessions and a general poster session (in which all contributing papers were presented). The purpose of this symposium was to explore the synergies between creative cognition and intelligent systems. The goal of the Emotion, Personality, and Social Behavior symposium was to examine fundamental issues in affect and personality in both biological and artificial agents, focusing on the roles of these factors in mediating social behavior. The Semantic Scientific Knowledge Integration symposium aimed to bring together the semantic technologies community and the scientific information technology community in an effort to build the general semantic science information community.
The goal of the Social Information Processing symposium was to investigate computational and analytic approaches that will enable users to harness the efforts of large numbers of other users to solve a variety of information processing problems, from discovering high-quality content to managing common resources. The goal of the Symbiotic Relationships between Semantic Web and Knowledge Engineering symposium was to explore how the lessons learned by the knowledge-engineering community over the past three decades could be applied to the bold research agenda of current workers in semantic web technologies. The purpose of the Using AI to Motivate Greater Participation in Computer Science symposium was to identify ways that topics in AI may be used to motivate greater student participation in computer science by highlighting fun, engaging, and intellectually challenging developments in AI-related curriculum at a number of educational levels. Technical reports of the symposia were published by AAAI Press.
Solving Multiagent Networks Using Distributed Constraint Optimization
Pearce, Jonathan P. (JP Morgan Chase) | Tambe, Milind (University of Southern California) | Maheswaran, Rajiv (University of Southern California)
In many cooperative multiagent domains, the effect of local interactions between agents can be compactly represented as a network structure. Given that agents are spread across such a network, agents directly interact only with a small group of neighbors. A distributed constraint optimization problem (DCOP) is a useful framework to reason about such networks of agents. Given agents’ inability to communicate and collaborate in large groups in such networks, we focus on an approach called k-optimality for solving DCOPs. In this approach, agents form groups of one or more agents until no group of k or fewer agents can possibly improve the DCOP solution; we define this type of local optimum, and any algorithm guaranteed to reach such a local optimum, as k-optimal. The article provides an overview of three key results related to k-optimality. The first set of results gives worst-case guarantees on the solution quality of k-optima in a DCOP. These guarantees can help determine an appropriate k-optimal algorithm, or possibly an appropriate constraint graph structure, for agents to use in situations where the cost of coordination between agents must be weighed against the quality of the solution reached. The second set of results gives upper bounds on the number of k-optima that can exist in a DCOP. These results are useful in domains where a DCOP must generate a set of solutions rather than a single solution. Finally, we sketch algorithms for k-optimality and provide some experimental results for 1-, 2- and 3-optimal algorithms for several types of DCOPs.
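The definition of a k-optimum can be made concrete with a small brute-force check. The sketch below is an illustration only, not one of the article's algorithms: it assumes a centralized view of a DCOP with binary constraints stored as reward tables, and it tests whether any group of k or fewer agents could raise the total reward by jointly changing their values. The function names and the two-agent example are hypothetical.

```python
from itertools import combinations, product


def total_reward(assignment, constraints):
    """Sum the reward of every binary constraint under the given assignment."""
    return sum(
        table[(assignment[i], assignment[j])]
        for (i, j), table in constraints.items()
    )


def is_k_optimal(assignment, domains, constraints, k):
    """True if no group of k or fewer agents can improve the total reward."""
    base = total_reward(assignment, constraints)
    for size in range(1, k + 1):
        for group in combinations(list(domains), size):
            # Try every joint value change available to this group.
            for values in product(*(domains[a] for a in group)):
                candidate = dict(assignment)
                candidate.update(zip(group, values))
                if total_reward(candidate, constraints) > base:
                    return False
    return True


# Two agents that are rewarded for agreeing, with agreement on 1 worth more.
domains = {"a": [0, 1], "b": [0, 1]}
constraints = {("a", "b"): {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2}}
assignment = {"a": 0, "b": 0}

print(is_k_optimal(assignment, domains, constraints, 1))  # True
print(is_k_optimal(assignment, domains, constraints, 2))  # False
```

In this toy instance neither agent gains by deviating alone, so the assignment is 1-optimal, but the pair can jointly switch to a better solution, so it is not 2-optimal. The enumeration grows exponentially with k, which reflects the trade-off between coordination cost and solution quality that the abstract mentions.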
Online Planning Algorithms for POMDPs
Ross, S., Pineau, J., Paquet, S., Chaib-draa, B.
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to its complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
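The lookahead search that online methods share can be sketched as a depth-limited expansion of the belief tree. The code below is a minimal illustration, not any specific algorithm from the survey: it uses exhaustive search with no pruning or heuristics, and the two-door "tiger"-style model, its probabilities, its rewards, and the discount factor are all assumptions chosen for the example.

```python
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
OBSERVATIONS = ["hear-left", "hear-right"]
GAMMA = 0.95  # assumed discount factor


def transition(s, a, s2):
    """P(s2 | s, a): listening keeps the state, opening a door resets it."""
    if a == "listen":
        return 1.0 if s == s2 else 0.0
    return 0.5


def observation(a, s2, o):
    """P(o | a, s2): listening is right 85% of the time, otherwise noise."""
    if a == "listen":
        correct = (s2 == "tiger-left") == (o == "hear-left")
        return 0.85 if correct else 0.15
    return 0.5


def reward(s, a):
    """Immediate reward: small cost to listen, large penalty for the tiger."""
    if a == "listen":
        return -1.0
    opened_tiger = (a == "open-left") == (s == "tiger-left")
    return -100.0 if opened_tiger else 10.0


def belief_update(belief, a, o):
    """Bayes update; returns the new belief and P(o | belief, a)."""
    unnormalized = {
        s2: observation(a, s2, o)
        * sum(transition(s, a, s2) * belief[s] for s in STATES)
        for s2 in STATES
    }
    prob_o = sum(unnormalized.values())
    if prob_o == 0.0:
        return None, 0.0
    return {s2: p / prob_o for s2, p in unnormalized.items()}, prob_o


def q_value(belief, a, depth):
    """Expected return of taking a, then acting greedily for depth - 1 steps."""
    value = sum(belief[s] * reward(s, a) for s in STATES)
    if depth > 1:
        for o in OBSERVATIONS:
            next_belief, prob_o = belief_update(belief, a, o)
            if prob_o > 0.0:
                value += GAMMA * prob_o * best_action(next_belief, depth - 1)[1]
    return value


def best_action(belief, depth):
    """Return (action, value) maximizing the depth-limited lookahead."""
    return max(
        ((a, q_value(belief, a, depth)) for a in ACTIONS),
        key=lambda pair: pair[1],
    )


uniform = {"tiger-left": 0.5, "tiger-right": 0.5}
print(best_action(uniform, depth=2)[0])
```

At each decision step an agent would call `best_action` on its current belief, execute the returned action, and then apply `belief_update` with the observation it receives. The cost of this exhaustive expansion grows exponentially with depth, which is why practical online methods rely on bounds and heuristics to prune or prioritize the search.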