Inductive Learning
Reinforcement and Imitation Learning via Interactive No-Regret Learning
Ross, Stephane, Bagnell, J. Andrew
Recent work has demonstrated that problems where a learner's predictions influence the input distribution it is tested on, particularly imitation learning and structured prediction, can be naturally addressed by an interactive approach and analyzed using no-regret online learning. These approaches to imitation learning, however, neither require nor benefit from information about the cost of actions. We extend existing results in two directions: first, we develop an interactive imitation learning approach that leverages cost information; second, we extend the technique to address reinforcement learning. The results provide theoretical support to the commonly observed successes of online approximate policy iteration. Our approach suggests a broad new family of algorithms and provides a unifying view of existing techniques for imitation and reinforcement learning.
HC-Search: A Learning Framework for Search-based Structured Prediction
Doppa, J.R., Fern, A., Tadepalli, P.
Structured prediction is the problem of learning a function that maps structured inputs to structured outputs. Prototypical examples of structured prediction include part-of-speech tagging and semantic segmentation of images. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then employs a separate learned cost function C to select a final prediction among those outputs. The overall loss of this prediction architecture decomposes into the loss due to H not leading to high quality outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decomposition, we minimize the overall loss in a greedy stage-wise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H according to their true losses. Importantly, this training procedure is sensitive to the particular loss function of interest and the time-bound allowed for predictions. Experiments on several benchmark domains show that our approach significantly outperforms several state-of-the-art methods.
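The two-stage architecture described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greedy search (the abstract only says "a search procedure"), and the names `hc_search`, `successors`, and `hamming` are hypothetical. In the framework, H and C are learned; here a hand-written Hamming distance stands in for both so the example runs on its own.

```python
from typing import Callable, List, Tuple

Output = Tuple[int, ...]


def hc_search(x, initial: Output,
              successors: Callable[[object, Output], List[Output]],
              H: Callable[[object, Output], float],
              C: Callable[[object, Output], float],
              time_bound: int) -> Output:
    """Generate candidates with a search guided by H, then select with C.

    Greedy search is an assumption made for this sketch.
    """
    current = initial
    candidates = [current]
    for _ in range(time_bound):
        # Heuristic H decides which successor the search visits next.
        current = min(successors(x, current), key=lambda y: H(x, y))
        candidates.append(current)
    # Cost function C picks the final prediction among all visited outputs.
    return min(candidates, key=lambda y: C(x, y))


def successors(x, y: Output) -> List[Output]:
    """Toy search space: every output reachable by flipping one bit."""
    return [y[:i] + (1 - y[i],) + y[i + 1:] for i in range(len(y))]


def hamming(x, y: Output) -> float:
    """Hand-written stand-in for the learned H and C (an assumption)."""
    return float(sum(a != b for a, b in zip(x, y)))


x = (1, 0, 1)
prediction = hc_search(x, (0, 0, 0), successors, hamming, hamming, time_bound=3)
print(prediction)
```

In this toy run the search passes through the best output and then moves away from it on the last step, so the final prediction depends on C choosing among all generated candidates rather than on where the search happens to stop.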
Discriminatively Reranking Abductive Proofs for Plan Recognition
Wiseman, Sam (Harvard University) | Shieber, Stuart (Harvard University)
We investigate the use of a simple, discriminative reranking approach to plan recognition in an abductive setting. In contrast to recent work, which attempts to model abductive plan recognition using various formalisms that integrate logic and graphical models (such as Markov Logic Networks or Bayesian Logic Programs), we instead advocate a simpler, more flexible approach in which plans found through an abductive beam-search are discriminatively scored based on arbitrary features. We show that this approach performs well even with relatively few positive training examples, and we obtain state-of-the-art results on two abductive plan recognition datasets, outperforming more complicated systems.
MUM: A Technique for Maximising the Utility of Macro-operators by Constrained Generation and Use
Chrpa, Lukáš (University of Huddersfield) | Vallati, Mauro (University of Huddersfield) | McCluskey, Thomas Leo (University of Huddersfield)
Research into techniques that reformulate problems so that general solvers derive solutions more efficiently has attracted much attention, in particular when the reformulation process is to some degree solver and domain independent. There are major challenges to overcome when applying such techniques to automated planning, however: reformulation methods such as adding macro-operators (macros, for short) can be detrimental because they tend to increase branching factors during solution search, while other methods such as learning entanglements can limit a planner's space of potentially solvable problems (its coverage) through over-pruning. These techniques may therefore work well with some domain-problem-planner combinations, but work poorly with others. In this paper we introduce a new learning technique (MUM) for synthesising macros from training example plans in order to improve the speed and coverage of domain-independent automated planning engines. MUM embodies domain-independent constraints for selecting macro candidates, for generating macros, and for limiting the size of the grounding set of learned macros, therefore maximising the utility of used macros. Our empirical results with IPC benchmark domains and a range of state-of-the-art planners demonstrate the advance that MUM makes to the increased coverage and efficiency of the planners. Comparisons with a previous leading macro learning mechanism further demonstrate MUM's capability.
Adaptive Stochastic Alternating Direction Method of Multipliers
Zhao, Peilin, Yang, Jinwei, Zhang, Tong, Li, Ping
The Alternating Direction Method of Multipliers (ADMM) has been studied for years. The traditional ADMM algorithm needs to compute, at each iteration, an (empirical) expected loss function on all training examples, resulting in a computational complexity proportional to the number of training examples. To reduce the time complexity, stochastic ADMM algorithms were proposed to replace the expected function with a random loss function associated with one uniformly drawn example plus a Bregman divergence. The Bregman divergence, however, is derived from a simple second order proximal function, the half squared norm, which could be a suboptimal choice. In this paper, we present a new family of stochastic ADMM algorithms with optimal second order proximal functions, which produce a new family of adaptive subgradient methods. We theoretically prove that their regret bounds are as good as the bounds which could be achieved by the best proximal function that can be chosen in hindsight. Encouraging empirical results on a variety of real-world datasets confirm the effectiveness and efficiency of the proposed algorithms.
Box Drawings for Learning with Imbalanced Data
Goh, Siong Thye, Rudin, Cynthia
The vast majority of real-world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly imbalanced classification problems. The classifiers are disjunctions of conjunctions, and are created as unions of axis-parallel rectangles around the positive examples, and thus have the benefit of being interpretable. The first algorithm uses mixed integer programming to optimize a weighted balance between positive and negative class accuracies. Regularization is introduced to improve generalization performance. The second method uses an approximation in order to assist with scalability. Specifically, it follows a characterize-then-discriminate approach, where the positive class is characterized first by boxes, and then each box boundary becomes a separate discriminative classifier. This method has the computational advantages that it can be easily parallelized, and considers only the relevant regions of feature space.
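To make the form of the classifier concrete, here is a minimal sketch of prediction with a union of axis-parallel boxes. It covers only how a fixed set of boxes classifies points, not how either algorithm learns them. The names `in_box`, `predict`, and `weighted_score`, and the exact form of the weighted objective (true positives plus a weight times true negatives), are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

# A box is a (lower corner, upper corner) pair, one bound per feature.
Box = Tuple[Sequence[float], Sequence[float]]


def in_box(x: Sequence[float], box: Box) -> bool:
    """Conjunction: every feature must lie within the box's bounds."""
    lower, upper = box
    return all(lo <= xi <= hi for lo, xi, hi in zip(lower, x, upper))


def predict(x: Sequence[float], boxes: List[Box]) -> int:
    """Disjunction: positive (1) if the point falls inside any box."""
    return 1 if any(in_box(x, b) for b in boxes) else 0


def weighted_score(X, y, boxes: List[Box], neg_weight: float) -> float:
    """Assumed objective: true positives + neg_weight * true negatives."""
    tp = sum(1 for xi, yi in zip(X, y) if yi == 1 and predict(xi, boxes) == 1)
    tn = sum(1 for xi, yi in zip(X, y) if yi == 0 and predict(xi, boxes) == 0)
    return tp + neg_weight * tn


boxes = [((0.0, 0.0), (1.0, 1.0)), ((2.0, 2.0), (3.0, 3.0))]
X = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (5.0, 5.0)]
y = [1, 0, 1, 0]
print([predict(x, boxes) for x in X])
print(weighted_score(X, y, boxes, neg_weight=0.5))
```

Each box reads directly as a rule ("feature 1 between 0 and 1 and feature 2 between 0 and 1"), which is the source of the interpretability the abstract mentions.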
An Easy to Use Repository for Comparing and Improving Machine Learning Algorithm Usage
Smith, Michael R., White, Andrew, Giraud-Carrier, Christophe, Martinez, Tony
The results from most machine learning experiments are used for a specific purpose and then discarded. This results in a significant loss of information and requires rerunning experiments to compare learning algorithms. Comparison also requires implementing another algorithm, which may not always be done correctly. By storing the results from previous experiments, machine learning algorithms can be compared easily and the knowledge gained from them can be used to improve their performance. The purpose of this work is to provide easy access to previous experimental results for learning and comparison. These stored results are comprehensive: they include the prediction for each test instance as well as the learning algorithm, hyperparameters, and training set that were used. Previous results are particularly important for meta-learning, which, in a broad sense, is the process of learning from previous machine learning results such that the learning process is improved. While other experiment databases do exist, one of our focuses is on easy access to the data. We provide meta-learning data sets that are ready to be downloaded for meta-learning experiments. In addition, queries to the underlying database can be made if specific information is desired. We also differ from previous experiment databases in that our database is designed at the instance level, where an instance is an example in a data set. We store the predictions of a learning algorithm trained on a specific training set for each instance in the test set. Data set level information can then be obtained by aggregating the results from the instances. The instance level information can be used for many tasks such as determining the diversity of a classifier or algorithmically determining the optimal subset of training instances for a learning algorithm.
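The aggregation from instance level to data set level can be sketched in a few lines. The record layout below (data set, algorithm, instance id, predicted label, actual label) and the function name `dataset_accuracy` are hypothetical; they do not reflect the repository's actual schema or query interface.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical instance-level record:
# (data set, algorithm, instance id, predicted label, actual label)
Record = Tuple[str, str, int, str, str]

records: List[Record] = [
    ("iris", "knn", 0, "setosa", "setosa"),
    ("iris", "knn", 1, "virginica", "versicolor"),
    ("iris", "tree", 0, "setosa", "setosa"),
    ("iris", "tree", 1, "versicolor", "versicolor"),
]


def dataset_accuracy(rows: List[Record]) -> Dict[Tuple[str, str], float]:
    """Aggregate per-instance predictions into accuracy per (data set, algorithm)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for dataset, algorithm, _instance_id, predicted, actual in rows:
        key = (dataset, algorithm)
        total[key] += 1
        correct[key] += int(predicted == actual)
    return {key: correct[key] / total[key] for key in total}


accuracy = dataset_accuracy(records)
print(accuracy)
```

Because the per-instance rows are kept, other summaries (for example, how often two algorithms disagree on the same instance) can be computed from the same records without rerunning any experiment.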
Active Semi-Supervised Learning Using Sampling Theory for Graph Signals
Gadde, Akshay, Anis, Aamir, Ortega, Antonio
We consider the problem of offline, pool-based active semi-supervised learning on graphs. This problem is important when the labeled data is scarce and expensive whereas unlabeled data is easily available. The data points are represented by the vertices of an undirected graph with the similarity between them captured by the edge weights. Given a target number of nodes to label, the goal is to choose those nodes that are most informative and then predict the unknown labels. We propose a novel framework for this problem based on our recent results on sampling theory for graph signals. A graph signal is a real-valued function defined on each node of the graph. A notion of frequency for such signals can be defined using the spectrum of the graph Laplacian matrix. The sampling theory for graph signals aims to extend the traditional Nyquist-Shannon sampling theory by allowing us to identify the class of graph signals that can be reconstructed from their values on a subset of vertices. This approach allows us to define a criterion for active learning based on sampling set selection which aims at maximizing the frequency of the signals that can be reconstructed from their samples on the set. Experiments show the effectiveness of our method.
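The notion of frequency mentioned above can be illustrated with the Laplacian quadratic form, which for the combinatorial Laplacian L = D - W equals the sum over edges of w_ij (x_i - x_j)^2. The sketch below covers only that notion, not the paper's sampling-set selection criterion. The choice of the combinatorial (unnormalized) Laplacian and the function names are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def laplacian(W: Matrix) -> Matrix:
    """Combinatorial graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def variation(W: Matrix, x: List[float]) -> float:
    """Quadratic form x^T L x; small for smooth (low-frequency) signals."""
    L = laplacian(W)
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# Path graph on four vertices with unit edge weights.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

print(variation(W, [1.0, 1.0, 1.0, 1.0]))    # constant signal
print(variation(W, [1.0, -1.0, 1.0, -1.0]))  # alternating signal
```

A constant signal has zero variation, while a signal that changes sign across every edge has the largest variation among signals of the same magnitude on this graph. Signals whose labels vary slowly over strongly connected vertices are the low-frequency ones.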
Clustering Spectral Filters for Extensible Feature Extraction in Musical Instrument Classification
Donnelly, Patrick (Montana State University) | Sheppard, John (Montana State University)
We propose a technique of training models for feature extraction using prior expectation of regions of importance in an instrument's timbre. Over a dataset of training examples, we extract significant spectral peaks, calculate their ratio to fundamental frequency, and use $k$-means clustering to identify a set of windows of spectral prominence for each instrument. These windows are used to extract amplitude values from training data to use as features in classification tasks. We test this approach on two databases of 17 instruments, cross evaluate between datasets, and compare with MFCC features.
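The pipeline of peak ratios followed by clustering can be sketched on toy numbers. This is an illustration under stated assumptions, not the authors' procedure: peak detection is skipped (peak frequencies are given directly), the k-means initialization is a simple deterministic one, and taking each window as the minimum and maximum ratio in a cluster is an assumption. All function names are hypothetical.

```python
from typing import List, Tuple


def peak_ratios(peaks_hz: List[float], f0_hz: float) -> List[float]:
    """Ratio of each spectral peak frequency to the fundamental frequency."""
    return [f / f0_hz for f in peaks_hz]


def kmeans_1d(values: List[float], k: int, iters: int = 100) -> List[List[float]]:
    """Plain 1-D k-means (Lloyd's algorithm) with deterministic initial centers."""
    data = sorted(values)
    n = len(data)
    # Initial centers: evenly spaced order statistics of the sorted data.
    centers = [data[int((i + 0.5) * n / k)] for i in range(k)]
    clusters: List[List[float]] = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters


def windows(clusters: List[List[float]]) -> List[Tuple[float, float]]:
    """Assumed window definition: (min, max) ratio of each non-empty cluster."""
    return [(min(c), max(c)) for c in clusters if c]


# Toy ratios pooled over several hypothetical training examples.
ratios = [1.0, 1.02, 0.98, 2.0, 2.05, 1.95, 3.0, 3.1, 2.9]
print(windows(kmeans_1d(ratios, k=3)))
```

Each resulting window marks a band of frequency ratios, relative to the fundamental, from which amplitude values would then be read as classification features.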
Perceptron-like Algorithms and Generalization Bounds for Learning to Rank
Chaudhuri, Sougata, Tewari, Ambuj
Learning to rank is a supervised learning problem where the output space is the space of rankings but the supervision space is the space of relevance scores. We make theoretical contributions to the learning to rank problem both in the online and batch settings. First, we propose a perceptron-like algorithm for learning a ranking function in an online setting. Our algorithm is an extension of the classic perceptron algorithm for the classification problem. Second, in the setting of batch learning, we introduce a sufficient condition for convex ranking surrogates to ensure a generalization bound that is independent of the number of objects per query. Our bound holds when linear ranking functions are used, a common practice in many learning to rank algorithms. En route to developing the online algorithm and generalization bound, we propose a novel family of listwise large margin ranking surrogates. Our novel surrogate family is obtained by modifying a well-known pairwise large margin ranking surrogate and is distinct from the listwise large margin surrogates developed using the structured prediction framework. Using the proposed family, we provide a guaranteed upper bound on the cumulative NDCG (or MAP) induced loss under the perceptron-like algorithm. We also show that the novel surrogates satisfy the generalization bound condition.