Inductive Learning
Learning Lexicographic Preference Trees From Positive Examples
Fargier, Hélène (IRIT, CNRS, University of Toulouse) | Gimenez, Pierre-François (IRIT, CNRS, University of Toulouse) | Mengin, Jérôme (IRIT, CNRS, University of Toulouse)
This paper considers the task of learning the preferences of users on a combinatorial set of alternatives, as is the case, for example, with online configurators. In many settings, what is available to the learner is a set of positive examples of alternatives that have been selected during past interactions. We propose to learn a model of the users' preferences that ranks previously chosen alternatives as high as possible. In this paper, we study the particular task of learning conditional lexicographic preferences. We present an algorithm to learn several classes of lexicographic preference trees, prove convergence properties of the algorithm, and experiment on both synthetic data and a real-world benchmark in the domain of recommendation in interactive configuration.
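To make the notion of a lexicographic preference concrete, the following is a minimal sketch in Python. It covers only the simplest, unconditional case (a linear tree in which attribute importance and value preferences do not depend on earlier choices) and is not the learning algorithm of the paper. The names `lex_key`, `rank`, `importance`, and `value_order` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Alternative = Dict[str, str]


def lex_key(alt: Alternative,
            importance: List[str],
            value_order: Dict[str, List[str]]) -> Tuple[int, ...]:
    """Map an alternative to a tuple of value ranks, most important attribute first.

    A lower rank means a more preferred value, so plain tuple comparison
    yields the lexicographic order.
    """
    return tuple(value_order[attr].index(alt[attr]) for attr in importance)


def rank(alternatives: List[Alternative],
         importance: List[str],
         value_order: Dict[str, List[str]]) -> List[Alternative]:
    """Return the alternatives sorted from most to least preferred."""
    return sorted(alternatives,
                  key=lambda a: lex_key(a, importance, value_order))


# Illustrative (made-up) configuration domain: color matters more than engine.
importance = ["color", "engine"]
value_order = {"color": ["red", "blue"], "engine": ["diesel", "petrol"]}
cars = [
    {"color": "blue", "engine": "diesel"},
    {"color": "red", "engine": "petrol"},
    {"color": "red", "engine": "diesel"},
]
ranked = rank(cars, importance, value_order)
```

In a conditional lexicographic preference tree, the next attribute and the order on its values may depend on the values already assigned higher up in the tree; the sketch above fixes both globally.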
Learning to Rank Based on Analogical Reasoning
Fahandar, Mohsen Ahmadi (Paderborn University) | Hüllermeier, Eyke (Paderborn University)
Object ranking or "learning to rank" is an important problem in the realm of preference learning. On the basis of training data in the form of a set of rankings of objects represented as feature vectors, the goal is to learn a ranking function that predicts a linear order of any new set of objects. In this paper, we propose a new approach to object ranking based on principles of analogical reasoning. More specifically, our inference pattern is formalized in terms of so-called analogical proportions and can be summarized as follows: Given objects A,B,C,D, if object A is known to be preferred to B, and C relates to D as A relates to B, then C is (supposedly) preferred to D. Our method applies this pattern as a main building block and combines it with ideas and techniques from instance-based learning and rank aggregation. Based on first experimental results for data sets from various domains (sports, education, tourism, etc.), we conclude that our approach is highly competitive. It appears to be specifically interesting in situations in which the objects are coming from different subdomains, and which hence require a kind of knowledge transfer.
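The inference pattern described above can be illustrated with a small Python sketch. It assumes features scaled to [0, 1] and uses a simplified arithmetic proportion, clamped at zero, as the measure of how well "A relates to B as C relates to D" holds. This particular formula and the names `proportion` and `analogy_degree` are assumptions made for illustration and may differ from the definitions used in the paper.

```python
from typing import Sequence


def proportion(a: float, b: float, c: float, d: float) -> float:
    """Degree to which a:b :: c:d holds for one feature in [0, 1].

    Simplified arithmetic proportion: 1 when the differences a - b and
    c - d coincide, decreasing as they diverge, clamped at 0.
    """
    return max(0.0, 1.0 - abs((a - b) - (c - d)))


def analogy_degree(A: Sequence[float], B: Sequence[float],
                   C: Sequence[float], D: Sequence[float]) -> float:
    """Average the per-feature proportions over all features."""
    scores = [proportion(a, b, c, d) for a, b, c, d in zip(A, B, C, D)]
    return sum(scores) / len(scores)


# Made-up objects: A is known to be preferred to B.
A, B = [0.9, 0.2], [0.5, 0.1]
C, D = [0.6, 0.8], [0.2, 0.7]

forward = analogy_degree(A, B, C, D)   # support for "C preferred to D"
backward = analogy_degree(A, B, D, C)  # support for "D preferred to C"
prediction = "C > D" if forward >= backward else "D > C"
```

A full method would collect such evidence over many known preference pairs and combine the resulting pairwise votes with a rank aggregation step, as the abstract indicates.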
Learning From Semi-Supervised Weak-Label Data
Dong, Hao-Chen (Nanjing University) | Li, Yu-Feng (Nanjing University) | Zhou, Zhi-Hua (Nanjing University)
Multi-label learning deals with data objects associated with multiple labels simultaneously. Previous studies typically assume that the full set of relevant labels is given for each training instance. In many applications such as image annotation, however, it is usually difficult to get the full label set for each instance, and only a partial or even empty set of relevant labels is available. We call this kind of problem the "semi-supervised weak-label learning" problem. In this work we propose the SSWL (Semi-Supervised Weak-Label) method to address this problem. Both instance similarity and label similarity are considered to complete the missing labels. An ensemble of multiple models is utilized to improve robustness when label information is insufficient. We formulate the objective as a bi-convex optimization problem and solve it with an efficient block coordinate descent algorithm. Experiments validate the effectiveness of SSWL.
ARC: Adversarial Robust Cuts for Semi-Supervised and Multi-Label Classification
Behpour, Sima (University of Illinois at Chicago) | Xing, Wei (University of Illinois at Chicago) | Ziebart, Brian D. (University of Illinois at Chicago)
Many structured prediction tasks arising in computer vision and natural language processing tractably reduce to making minimum cost cuts in graphs with edge weights learned using maximum margin methods. Unfortunately, the hinge loss used to construct these methods often provides a particularly loose bound on the loss function of interest (e.g., the Hamming loss). We develop Adversarial Robust Cuts (ARC), an approach that poses the learning task as a minimax game between predictor and "label approximator" based on minimum cost graph cuts. Unlike maximum margin methods, this game-theoretic perspective always provides meaningful bounds on the Hamming loss. We conduct multi-label and semi-supervised binary prediction experiments that demonstrate the benefits of our approach.
Model-Free Iterative Temporal Appliance Discovery for Unsupervised Electricity Disaggregation
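Valovage, Mark (Computer Science & Engineering, University of Minnesota, Minneapolis) | Shekhawat, Akshay (Computer Science & Engineering, University of Minnesota, Minneapolis) | Gini, Maria (Computer Science & Engineering, University of Minnesota, Minneapolis)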
Electricity disaggregation identifies individual appliances from one or more aggregate data streams and has immense potential to reduce residential and commercial electrical waste. Since supervised learning methods rely on meticulously labeled training samples that are expensive to obtain, unsupervised methods show the most promise for widespread application. However, unsupervised learning methods previously applied to electricity disaggregation suffer from critical limitations. This paper introduces the concept of iterative appliance discovery, a novel unsupervised disaggregation method that progressively identifies the "easiest to find" or "most likely" appliances first. Once these simpler appliances have been identified, the computational complexity of the search space can be significantly reduced, enabling iterative discovery to identify more complex appliances. We test iterative appliance discovery against an existing competitive unsupervised method using two publicly available datasets. Results using different sampling rates show that iterative discovery has faster runtimes and produces better accuracy. Furthermore, iterative discovery does not require prior knowledge of appliance characteristics and demonstrates unprecedented scalability, identifying long, overlapped sequences that other unsupervised learning algorithms cannot.
Warmstarting of Model-Based Algorithm Configuration
Lindauer, Marius (University of Freiburg) | Hutter, Frank (University of Freiburg)
The performance of many hard combinatorial problem solvers depends strongly on their parameter settings, and since manual parameter tuning is both tedious and suboptimal, the AI community has recently developed several algorithm configuration (AC) methods to automatically address this problem. While all existing AC methods start the configuration process of an algorithm A from scratch for each new type of benchmark instances, here we propose to exploit information about A's performance on previous benchmarks in order to warmstart its configuration on new types of benchmarks. We introduce two complementary ways in which we can exploit this information to warmstart AC methods based on a predictive model. Experiments for optimizing a flexible modern SAT solver on twelve different instance sets show that our methods often yield substantial speedups over existing AC methods (up to 165-fold) and can also find substantially better configurations given the same compute budget.
Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection
Paudice, Andrea | Muñoz-González, Luis | Gyorgy, Andras | Lupu, Emil C.
Machine learning has become an important component of many systems and applications, including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks. Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset.
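The idea of pre-filtering a training set with outlier detection can be sketched in a few lines of Python. This is an illustrative stand-in and not the detector evaluated in the paper: it assumes a simple per-class rule that drops any point whose distance to the class's coordinate-wise median exceeds a multiple of the median distance. The function name `filter_outliers` and the `factor` parameter are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence


def filter_outliers(points: Sequence[Sequence[float]],
                    labels: Sequence[int],
                    factor: float = 3.0) -> List[int]:
    """Return the indices of the training points kept after pre-filtering.

    For each class, compute a robust centre (coordinate-wise median) and
    keep only the points whose distance to it is at most `factor` times
    the median distance within that class.
    """
    kept: List[int] = []
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        dim = len(points[idx[0]])
        centre = [median(points[i][k] for i in idx) for k in range(dim)]
        dists = {i: math.dist(points[i], centre) for i in idx}
        cutoff = factor * median(dists.values())
        kept.extend(i for i in idx if dists[i] <= cutoff)
    return sorted(kept)


# Made-up data: index 4 plays the role of a poisoning point far from class 0.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0),
     (10.0, 10.0), (10.2, 10.0), (10.0, 10.2)]
y = [0, 0, 0, 0, 0, 1, 1, 1]
clean_indices = filter_outliers(X, y)
```

The classifier would then be trained only on the points listed in `clean_indices`. Using the median rather than the mean keeps the centre itself from being dragged toward the injected points, which is a design choice of this sketch rather than a claim about the paper.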
Police: Man put dismembered wife in suitcase, set it ablaze
LOS ANGELES – Investigators believe a homeless man killed his wife in an abandoned restaurant, chopped up her body, stuffed it into a suitcase and then calmly rode with it aboard a train before he burned her remains in a parking lot, Los Angeles police said Tuesday. After Valentino Gutierrez killed his wife last week in a shuttered restaurant in Pasadena, he dismembered her body, stuffed her remains into a large suitcase and boarded a light-rail train at a nearby station, Deputy Chief Justin Eisenberg said. Gutierrez, 56, who was charged Thursday with murder and arson, didn't draw any suspicion on the train and hopped aboard his bicycle after he exited the train. With the suitcase in tow, he pedaled from a train station to the parking lot of a Home Depot in Los Angeles, where he set the suitcase ablaze. Detectives still haven't identified a motive in the case and coroner's officials have been unable to identify the burned remains.
Supervised Learning with Python – Towards Data Science
The future of planet Earth is Artificial Intelligence / Machine Learning. Anyone who does not understand it will soon find themselves left behind. Waking up in this world full of innovation feels more and more like magic. There are many kinds of implementations and techniques to carry out Artificial Intelligence and Machine Learning to solve real-time problems, out of which Supervised Learning is one of the most used approaches. In supervised learning, we start by importing a dataset containing the training attributes and the target attributes.