Industry
Housing Markets with Indifferences: A Tale of Two Mechanisms
Aziz, Haris (Technische Universität München) | Keijzer, Bart de (Centrum Wiskunde Informatica)
The (Shapley-Scarf) housing market is a well-studied and fundamental model of an exchange economy. Each agent owns a single house and the goal is to reallocate the houses to the agents in a mutually beneficial and stable manner. Recently, Alcalde-Unzu and Molis (2011) and Jaramillo and Manjunath (2011) independently examined housing markets in which agents can express indifferences among houses. They proposed two important families of mechanisms, known as TTAS and TCR respectively. We formulate a family of mechanisms which not only includes TTAS and TCR but also satisfies many desirable properties of both families. As a corollary, we show that TCR is strict core selecting (if the strict core is non-empty). Finally, we settle an open question regarding the computational complexity of the TTAS mechanism. Our study also raises a number of interesting research questions.
Online Kernel Selection: Algorithms and Evaluations
Yang, Tianbao (Michigan State University) | Mahdavi, Mehrdad (Michigan State University) | Jin, Rong (Michigan State University) | Yi, Jinfeng (Michigan State University) | Hoi, Steven C.H. (Nanyang Technological University)
Kernel methods have been successfully applied to many machine learning problems. Nevertheless, since the performance of kernel methods depends heavily on the type of kernels being used, identifying good kernels among a set of given kernels is important to the success of kernel methods. A straightforward approach to address this problem is cross-validation by training a separate classifier for each kernel and choosing the best kernel classifier out of them. Another approach is Multiple Kernel Learning (MKL), which aims to learn a single kernel classifier from an optimal combination of multiple kernels. However, both approaches suffer from a high computational cost in computing the full kernel matrices and in training, especially when the number of kernels or the number of training examples is very large. In this paper, we tackle this problem by proposing an efficient online kernel selection algorithm. It incrementally learns a weight for each kernel classifier. The weight for each kernel classifier can help us to select a good kernel among a set of given kernels. The proposed approach is efficient in that (i) it is an online approach and therefore avoids computing all the full kernel matrices before training; (ii) it only updates a single kernel classifier each time by a sampling technique and therefore saves time on updating kernel classifiers with poor performance; (iii) it has a theoretically guaranteed performance compared to the best kernel predictor. Empirical studies on image classification tasks demonstrate the effectiveness of the proposed approach for selecting a good kernel among a set of kernels.
A Bregman Divergence Optimization Framework for Ranking on Data Manifold and Its New Extensions
Xu, Bin (Zhejiang University) | Bu, Jiajun (Zhejiang University) | Chen, Chun (Zhejiang University) | Cai, Deng (Zhejiang University)
Recently, graph-based ranking algorithms have received considerable interests in machine learning, computer vision and information retrieval communities. Ranking on data manifold (or manifold ranking, MR) is one of the representative approaches. One of the limitations of manifold ranking is its high computational complexity (O( n 3 ), where n is the number of samples in database). In this paper, we cast the manifold ranking into a Bregman divergence optimization framework under which we transform the original MR to an equivalent optimal kernel matrix learning problem.With this new formulation, two effective and efficient extensions are proposed to enhance the ranking performance. Extensive experimental results on two real world image databases show the effectiveness of the proposed approach.
Learning from Demonstration for Goal-Driven Autonomy
Weber, Ben George (University of California, Santa Cruz) | Mateas, Michael (University of California, Santa Cruz) | Jhala, Arnav (University of California, Santa Cruz)
Goal-driven autonomy (GDA) is a conceptual model for creating an autonomous agent that monitors a set of expectations during plan execution, detects when discrepancies occur, builds explanations for the cause of failures, and formulates new goals to pursue when planning failures arise. While this framework enables the development of agents that can operate in complex and dynamic environments, implementing the logic for each of the subtasks in the model requires substantial domain engineering. We present a method using case-based reasoning and intent recognition in order to build GDA agents that learn from demonstrations. Our approach reduces the amount of domain engineering necessary to implement GDA agents and learns expectations, explanations, and goals from expert demonstrations. We have applied this approach to build an agent for the real-time strategy game StarCraft. Our results show that integrating the GDA conceptual model into the agent greatly improves its win rate.
Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching
Treeratpituk, Pucktada (Pennsylvania State University) | Giles, C. Lee (Pennsylvania State University)
Personal names are important and common information in many data sources, ranging from social networks and news articles to patient records and scientific documents.They are often used as queries for retrieving records and also as key information for linking documents from multiple sources. Matching personal names can be challenging due to variations in spelling and various formatting of names. While many approximated name matching techniques have been proposed, most are generic string-matching algorithms. Unlike other types of proper names, personal names are highly cultural. Many ethnicities have their own unique naming systems and identifiable characteristics. In this paper we explore such relationships between ethnicities and personal names to improve the name matching performance. First, we propose a name-ethnicity classifier based on the multinomial logistic regression. Our model can effectively identify name-ethnicity from personal names in Wikipedia, which we use to define name-ethnicity, to within 85\% accuracy.Next, we propose a novel alignment-based name matching algorithm, based on Smith–Waterman algorithm and logistic regression.Different name matching models are then trained for different name-ethnicity groups.Our preliminary experimental result on DBLP's disambiguated author dataset yields a performance of 99\% precision and 89\% recall.Surprisingly, textual features carry more weight than phonetic ones in name-ethnicity classification.
Hierarchical Double Dirichlet Process Mixture of Gaussian Processes
Tayal, Aditya (University of Waterloo) | Poupart, Pascal (University of Waterloo) | Li, Yuying (University of Waterloo)
We consider an infinite mixture model of Gaussian processes that share mixture components between non-local clusters in data. Meeds and Osindero (2006) use a single Dirichlet process prior to specify a mixture of Gaussian processes using an infinite number of experts. In this paper, we extend this approach to allow for experts to be shared non-locally across the input domain. This is accomplished with a hierarchical double Dirichlet process prior, which builds upon a standard hierarchical Dirichlet process by incorporating local parameters that are unique to each cluster while sharing mixture components between them. We evaluate the model on simulated and real data, showing that sharing Gaussian process components non-locally can yield effective and useful models for richly clustered non-stationary, non-linear data.
Leveraging Domain Knowledge in Multitask Bayesian Network Structure Learning
Oyen, Diane (University of New Mexico) | Lane, Terran (University of New Mexico)
Network structure learning algorithms have aided network discovery in fields such as bioinformatics, neuroscience, ecology and social science. However, challenges remain in learning informative networks for related sets of tasks because the search space of Bayesian network structures is characterized by large basins of approximately equivalent solutions. Multitask algorithms select a set of networks that are near each other in the search space, rather than a score-equivalent set of networks chosen from independent regions of the space. This selection preference allows a domain expert to see only differences supported by the data. However, the usefulness of these algorithms for scientific datasets is limited because existing algorithms naively assume that all pairs of tasks are equally related. We introduce a framework that relaxes this assumption by incorporating domain knowledge about task-relatedness into the learning objective. Using our framework, we introduce the first multitask Bayesian network algorithm that leverages domain knowledge about the relatedness of tasks. We use our algorithm to explore the effect of task-relatedness on network discovery and show that our algorithm learns networks that are closer to ground truth than naive algorithms and that our algorithm discovers patterns that are interesting.
Learning Behavior Models for Hybrid Timed Systems
Niggemann, Oliver (Ostwestfalen-Lippe, University of Applied Sciences) | Stein, Benno (Bauhaus-Universität Weimar) | Vodencarevic, Asmir (University of Paderborn) | Maier, Alexander (Ostwestfalen-Lippe, University of Applied Sciences) | Büning, Hans Kleine (University of Paderborn)
A tailored model of a system is the prerequisite for various analysis tasks, such as anomaly detection, fault identification, or quality assurance. This paper deals with the algorithmic learning of a system’s behavior model given a sample of observations. In particular, we consider real-world production plants where the learned model must capture timing behavior, dependencies between system variables, as well as mode switches—in short: hybrid system’s characteristics. Usually, such model formation tasks are solved by human engineers, entailing the well-known bunch of problems including knowledge acquisition, development cost, or lack of experience. Our contributions to the outlined field are as follows. (1) We present a taxonomy of learning problems related to model formation tasks. As a result, an important open learning problem for the domain of production system is identified: The learning of hybrid timed automata. (2) For this class of models, the learning algorithm HyBUTLA is presented. This algorithm is the first of its kind to solve the underlying model formation problem at scalable precision. (3) We present two case studies that illustrate the usability of this approach in realistic settings. (4) We give a proof for the learning and runtime properties of HyBUTLA.
Design and Optimization of an Omnidirectional Humanoid Walk: A Winning Approach at the RoboCup 2011 3D Simulation Competition
MacAlpine, Patrick (University of Texas at Austin) | Barrett, Samuel (University of Texas at Austin) | Urieli, Daniel (University of Texas at Austin) | Vu, Victor (University of Texas at Austin) | Stone, Peter (University of Texas at Austin)
This paper presents the design and learning architecture for an omnidirectional walk used by a humanoid robot soccer agent acting in the RoboCup 3D simulation environment. The walk, which was originally designed for and tested on an actual Nao robot before being employed in the 2011 RoboCup 3D simulation competition, was the crucial component in the UT Austin Villa team winning the competition in 2011. To the best of our knowledge, this is the first time that robot behavior has been conceived and constructed on a real robot for the end purpose of being used in simulation. The walk is based on a double linear inverted pendulum model, and multiple sets of its parameters are optimized via a novel framework. The framework optimizes parameters for different tasks in conjunction with one another, a little-understood problem with substantial practical significance. Detailed experiments show that the UT Austin Villa agent significantly outperforms all the other agents in the competition with the optimized walk being the key to its success.
Ensemble Feature Weighting Based on Local Learning and Diversity
Li, Yun (Nanjing University of Posts and Telecommunications) | Gao, Suyan (Nanjing University of Posts and Telecommunications) | Chen, Songcan (Nanjing University of Aeronautics and Astronautics)
Recently, besides the performance, the stability (robustness, i.e., the variation in feature selection results due to small changes in the data set) of feature selection is received more attention. Ensemble feature selection where multiple feature selection outputs are combined to yield more robust results without sacrificing the performance is an effective method for stable feature selection. In order to make further improvements of the performance (classification accuracy), the diversity regularized ensemble feature weighting framework is presented, in which the base feature selector is based on local learning with logistic loss for its robustness to huge irrelevant features and small samples. At the same time, the sample complexity of the proposed ensemble feature weighting algorithm is analyzed based on the VC-theory. The experiments on different kinds of data sets show that the proposed ensemble method can achieve higher accuracy than other ensemble ones and other stable feature selection strategy (such as sample weighting) without sacrificing stability